Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Green spaces are believed to improve the well-being of users in urban areas. While there is urban research exploring the emotional benefits of green spaces, these works are based on user surveys and case studies, which are typically small in scale, intrusive, time-intensive and costly. In contrast to earlier works, we utilize a non-intrusive methodology to understand green space effects at a large scale and in greater detail, via digital traces left by Twitter users. Using this methodology, we perform an empirical study on the effects of green spaces on user sentiments and emotions in Melbourne, Australia and our main findings are: (i) tweets in green spaces evoke more positive and less negative emotions, compared to those in urban areas; (ii) each season affects various emotion types differently; (iii) there are interesting changes in sentiments based on the hour, day and month that a tweet was posted; and (iv) negative sentiments are typically associated with large transport infrastructures such as train interchanges, major road junctions and railway tracks. The novelty of our study is the combination of psychological theory, alongside data collection and analysis techniques on a large-scale Twitter dataset, which overcomes the limitations of traditional methods in urban research.
Proceedings of the 2018 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD'18), 2018
Twitter is a popular social networking site that generates a large volume and variety of tweets. A key challenge is thus to filter and track relevant tweets and identify the main topics discussed in real-time. For this purpose, we developed the Real-time Analytics Platform for Interactive Data mining (RAPID) system, which provides an effective data collection mechanism through query expansion, along with numerous analysis and visualization capabilities for understanding user interactions, tweeting behaviours, discussion topics, and other social patterns.
Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need to select Points of Interest (POI) to visit in unfamiliar cities, and to select POIs that align with their interest preferences and trip constraints. We propose an algorithm called PersTour for recommending personalized tours using POI popularity and user interest preferences, which are automatically derived from real-life travel sequences based on geo-tagged photos. Our tour recommendation problem is modelled using a formulation of the Orienteering problem, and considers user trip constraints such as time limits and the need to start and end at specific POIs. In our work, we also reflect levels of user interest based on visit durations, and demonstrate how POI visit duration can be personalized using this time-based user interest. Furthermore, we demonstrate how PersTour can be further enhanced by: (i) a weighted updating of user interests based on the recency of their POI visits; and (ii) an automatic weighting between POI popularity and user interests based on the tourist's activity level. Using a Flickr dataset of ten cities, our experiments show the effectiveness of PersTour against various collaborative filtering and greedy-based baselines, in terms of tour popularity, interest, recall, precision and F1-score. In particular, our results show the merits of using time-based user interest and personalized POI visit durations, compared to the current practice of using frequency-based user interest and average visit durations.
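As a rough illustration of the time-based user interest mentioned in the abstract above, the following Python sketch scores a user's interest in a POI category by summing each visit duration relative to that POI's average visit duration. The function name `time_based_interest`, the input layout, and the exact normalization are assumptions made for illustration; the abstract does not give the formula.

```python
from collections import defaultdict


def time_based_interest(visits, poi_category):
    """Estimate per-user interest in each POI category from visit durations.

    visits: list of (user, poi, duration_minutes) tuples (hypothetical layout).
    poi_category: dict mapping each POI name to its category.
    """
    # Average visit duration of each POI across all users.
    total = defaultdict(float)
    count = defaultdict(int)
    for _, poi, duration in visits:
        total[poi] += duration
        count[poi] += 1
    average = {poi: total[poi] / count[poi] for poi in total}

    # A visit longer than the POI average contributes more than 1.0,
    # a shorter visit contributes less than 1.0.
    interest = defaultdict(lambda: defaultdict(float))
    for user, poi, duration in visits:
        if average[poi] > 0:  # skip POIs with no recorded time spent
            interest[user][poi_category[poi]] += duration / average[poi]
    return {user: dict(cats) for user, cats in interest.items()}


visits = [
    ("alice", "museum", 120),
    ("bob", "museum", 60),
    ("alice", "park", 30),
    ("bob", "park", 30),
]
poi_category = {"museum": "culture", "park": "nature"}
scores = time_based_interest(visits, poi_category)
print(scores["alice"])
```

In this toy data, alice stays at the museum longer than the average visitor, so her "culture" score exceeds bob's, while both have the same "nature" score. A frequency-based measure would rate the two users identically, since each visited both POIs once.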
Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Studying large, widely spread Twitter data has laid the foundation for many novel applications, from predicting natural disasters and epidemics to understanding urban dynamics. Recent studies have focused on exploring people's emotional response to their urban environment, e.g., green spaces versus built-up areas, through analysing the sentiment of tweets within that area. Since green spaces have the capacity to improve citizens' well-being, we developed a system that is capable of recommending green spaces to users. Our system is unique in the sense that the recommendations are tailored with regard to users' preferred activity as well as the degree of positive sentiments in each green space. We show that the incoming flow of tweets can be used to refine the recommendations over time. Furthermore, we implemented a web-based, user-friendly interface to solicit user inputs and display recommendation results.
Proceedings of the 7th International Workshop on New Frontiers in Mining Complex Patterns (NFMCP'18), 2018
Twitter is increasingly used for political, advertising and marketing campaigns, where the main aim is to influence users to support specific causes, individuals or groups. We propose a novel methodology for mining and analyzing Twitter campaigns, which includes: (i) collecting tweets and detecting topics relating to a campaign; (ii) mining important campaign topics using scientometrics measures; (iii) modelling user interests using hashtags and topical entropy; (iv) identifying influential users using an adapted PageRank score; and (v) various metrics and visualization techniques for identifying bot-like activities. While this methodology is generalizable to multiple campaign types, we demonstrate its effectiveness on the 2017 German federal election.
Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Travelling and touring are popular leisure activities enjoyed by millions of tourists around the world. However, the task of travel itinerary recommendation and planning is tedious and challenging for tourists, who are often unfamiliar with the various Points-of-Interest (POIs) in a city. Apart from identifying popular POIs, the tourist needs to construct a travel itinerary comprising a subset of these POIs, and to order these POIs as a sequence of visits that can be completed within his/her available touring time. For a more realistic itinerary, the tourist also has to account for travelling time between POIs and visiting times at individual POIs. Furthermore, this itinerary should incorporate tourist preferences such as desired starting and ending POIs (e.g., POIs that are near the tourist's hotel) and a subset of must-see POIs (e.g., popular POIs that a tourist must visit). We term this the TourMustSee problem, which is based on a variant of the Orienteering problem. We then propose the LP+M algorithm for solving the TourMustSee problem as an Integer Linear Program (ILP). Using a Flickr dataset of POI visits in seven touristic cities, we compare LP+M against various ILP-based baselines, and the results show that LP+M recommends better travel itineraries in terms of POI popularity, total POIs visited, total touring time utilized and must-visit POI(s) inclusion.
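To make the problem statement above concrete, here is a minimal Python sketch of a must-see tour search on a toy instance. It is not the LP+M algorithm and does not use an ILP solver: it simply enumerates every ordering of the intermediate POIs, which is only feasible for a handful of POIs. The function name `best_tour`, the data layout, and the choice of total popularity as the objective are assumptions for illustration, and the start and end POIs are assumed to be distinct.

```python
from itertools import permutations


def best_tour(pois, travel, start, end, must_see, budget):
    """Exhaustively search for the most popular feasible tour.

    pois: dict name -> (popularity, visit_minutes)   (hypothetical layout)
    travel: dict (a, b) -> travel minutes from a to b
    must_see: set of POIs that every returned tour has to include
    budget: total minutes available for travelling and visiting
    """
    middle = [p for p in pois if p not in (start, end)]
    best, best_score = None, float("-inf")
    for k in range(len(middle) + 1):
        for order in permutations(middle, k):
            tour = (start, *order, end)
            # Reject tours that miss any must-see POI.
            if not must_see.issubset(tour):
                continue
            # Total time is visiting time plus travel between consecutive POIs.
            cost = sum(pois[p][1] for p in tour)
            cost += sum(travel[(a, b)] for a, b in zip(tour, tour[1:]))
            if cost > budget:
                continue
            score = sum(pois[p][0] for p in tour)
            if score > best_score:
                best, best_score = tour, score
    return best, best_score


pois = {
    "hotel": (0, 0),
    "museum": (5, 60),
    "park": (3, 30),
    "tower": (8, 90),
    "station": (0, 0),
}
# Toy assumption: every pair of POIs is 10 minutes apart.
travel = {(a, b): 10 for a in pois for b in pois if a != b}
tour, score = best_tour(pois, travel, "hotel", "station", {"park"}, budget=150)
print(tour, score)
```

With a 150-minute budget the search keeps the must-see park and adds the tower rather than the museum, since visiting all three would exceed the budget. The number of candidate orderings grows factorially with the number of POIs, which is why formulations such as the ILP described in the abstract are used for realistic instances.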
Tourism is a popular leisure activity and an important industry, where the main task involves visiting unfamiliar Places-of-Interest (POI) in foreign cities. Recommending POIs and tour planning are challenging and time-consuming tasks for tourists due to: (i) the need to identify and recommend captivating POIs in an unfamiliar city; (ii) having to schedule POI visits as a connected itinerary that satisfies trip constraints such as starting/ending near a specific location (e.g., the tourist's hotel) and completing the itinerary within a limited touring duration; and (iii) having to satisfy the diverse interest preferences of each unique tourist. While tourism-related information can be obtained from the Internet, travel guides and tour agencies, many of these resources simply recommend individual POIs or popular itineraries, but otherwise do not appeal to the interest preferences of users or adhere to their trip constraints.
In contrast to existing works on next-POI prediction and top-k POI recommendation that recommend a single POI or a ranked list of POIs, the task of tour recommendation involves the need to identify a set of interesting POIs and schedule them as an itinerary with various time and space constraints. While there are works on path planning that recommend an itinerary, this itinerary is typically optimized based on a global utility such as POI popularity, and thus offers no personalization for a tourist based on his/her interest preferences. This thesis addresses the challenges associated with the automation and personalization of tour recommendation using data mining techniques to model user interest and POI-related information, and using optimization problems and techniques to formulate and solve more realistic tour recommendation problems. Our main contributions include:
1. Proposing and implementing a framework that utilizes Flickr geo-tagged photos and Wikipedia to automatically determine user trajectories, interest preferences and POI-related information such as POI popularity and visiting times.
2. Proposing the PersTour algorithm for recommending personalized tour itineraries based on POI popularity, users' interest preferences and trip constraints, where POI visit durations are customized based on user interests.
3. Formulating the QueueTourRec problem for recommending queue-aware and personalized itineraries that schedule visits to popular and interesting POIs at times with minimal queuing times, and proposing a novel implementation of Monte Carlo Tree Search to solve this problem.
4. Developing the TourRecInt algorithm for tour recommendation based on a variant of the Orienteering problem with a mandatory POI category, which is defined as the POI category that a tourist has most frequently visited.
5. Formulating and solving the novel GroupTourRec problem, which involves recommending tour itineraries to groups of tourists with diverse interests and assigning tour guides with the right expertise to lead each tour group.
6. Illustrating the application of our proposed approach in practice, by presenting a web-based system implementation of our PersTour algorithm, with the front-end component developed using HTML, PHP, jQuery and the Google Maps API, and the back-end based on Python, Java and PHP.
Proceedings of the 2017 IEEE International Conference on Big Data (BigData'17), 2017
Twitter is a popular microblogging service, where users frequently engage in discussions about various topics of interest, ranging from popular topics (e.g., music) to niche topics (e.g., politics). With the large number of tweets, a key challenge is to automatically model and determine the discussion topics without having prior knowledge of the types and number of topics, or requiring the technical expertise to define various algorithmic parameters. For this purpose, we propose the Clustering-based Topic Modelling (ClusTop) algorithm that constructs various types of word network and automatically determines the discussion topics using community detection approaches. Unlike traditional topic models, ClusTop is able to automatically determine the appropriate number of topics and does not require numerous parameters to be set. The ClusTop algorithm is also able to capture the syntactic meaning in tweets via the use of bigrams, trigrams and other word combinations in constructing the word network graph. Using three Twitter datasets with labelled crises and events as topics, ClusTop has been shown to outperform various baselines in terms of topic coherence, pointwise mutual information, precision, recall and F-score.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'17), 2017
Personalized itinerary recommendation is a complex and time-consuming problem, due to the need to recommend popular attractions that are aligned to the interest preferences of a tourist, and to plan these attraction visits as an itinerary that has to be completed within a specific time limit. Furthermore, many existing itinerary recommendation systems do not automatically determine and consider queuing times at attractions in the recommended itinerary, which vary based on the time of visit to the attraction, e.g., longer queuing times at peak hours. To solve these challenges, we propose the PersQ algorithm for recommending personalized itineraries that take into consideration attraction popularity, user interests and queuing times. We also implement a framework that utilizes geo-tagged photos to derive attraction popularity, user interests and queuing times, which PersQ uses to recommend personalized and queue-aware itineraries. We demonstrate the effectiveness of PersQ in the context of five major theme parks, based on a Flickr dataset spanning nine years. Experimental results show that PersQ outperforms various state-of-the-art baselines, in terms of various queuing-time related metrics, itinerary popularity, user interest alignment, recall, precision and F1-score.
Proceedings of the 2017 IEEE International Conference on Big Data (BigData'17), 2017
Topic modelling is a well-studied field that aims to identify topics from traditional documents such as news articles and reports. More recently, Latent Dirichlet Allocation (LDA) and its variants have been applied to social media platforms to model and study topics relating to sports, politics and companies. While these applications were able to successfully identify the general topics, we posit that standard LDA can be augmented with spatial and temporal considerations based on the geo-coordinates and timestamps of social media posts. Towards this effort, we propose a spatial and temporal variant of LDA to better detect more specific topics, such as a particular art exhibit held at a museum or a security incident happening on a particular day. We validate our approach on a Twitter dataset and find that the detected topics are well-aligned to real-life events happening on the specific days and locations.
Extended Proceedings of the 24th Conference on User Modeling, Adaptation and Personalization (UMAP'16), Doctoral Consortium, 2016
Travel itinerary recommendation is an important but challenging problem, due to the need to recommend captivating Places-of-Interest (POI) and construct these POIs as a connected itinerary. Another challenge is to personalize these recommended itineraries based on tourist interests and their preferences for starting/ending POIs and time/distance budgets. Our work aims to address these challenges by proposing algorithms to recommend personalized travel itineraries for both individuals and groups of tourists, based on their interest preferences. To determine these interests, we first construct tourists' past POI visits based on their geo-tagged photos and then build a model of user interests based on their time spent visiting each POI. Experimental evaluation on a Flickr dataset of multiple cities shows that our proposed algorithms outperform various baselines in terms of recall, precision, F1-score and other heuristics-based metrics.
Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS'16), Doctoral Consortium, 2016
Trip planning is both challenging and tedious for tourists due to their unique interest preferences and various trip constraints. Despite the availability of online resources for tour planning and services provided by tour agencies, there are various challenges such as: (i) selecting POIs that are personalized to the unique interests of individual travellers; (ii) constructing these POIs as an itinerary, with considerations for time availability and starting/ending place preferences (e.g., near a tourist's hotel); (iii) for tour agencies to group tourists into tour groups such that the recommended tour appeals to the interests of the group as a whole; and (iv) similarly, for tour agencies to assign tour guides with the right expertise to lead each of these tour groups. In our work, we aim to develop algorithms for recommending personalized tours to both individual travellers and groups of tourists, based on their interest preferences, which we automatically determine based on geo-tagged photos posted by these tourists. Using a Flickr dataset of geo-tagged photos as ground-truth for real-life POI visits in multiple cities, we evaluate our proposed algorithms using various metrics such as precision, recall, F1-score, user interest scores and POI popularity, among others.
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT'16), 2016
Knowing the location of a social media user and their posts is important for various purposes, such as the recommendation of location-based items/services, and locality detection of crises/disasters. This paper describes our submission to the shared task "Geolocation Prediction in Twitter" of the 2nd Workshop on Noisy User-generated Text. In this shared task, we propose an algorithm to predict the location of Twitter users and tweets using a multinomial Naive Bayes classifier trained on Location Indicative Words and various textual features (such as city/country names, #hashtags and @mentions). We compared our approach against various baselines based on Location Indicative Words, city/country names, #hashtags and @mentions as individual feature sets, and experimental results show that our approach outperforms these baselines in terms of classification accuracy, mean and median error distance.
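For readers unfamiliar with the classifier named in the abstract above, the following Python sketch shows a bare-bones multinomial Naive Bayes model with Laplace smoothing applied to tokenized tweets. The function names, the toy tokens and city labels, and the smoothing value are illustrative assumptions; the feature extraction described in the paper (Location Indicative Words and the other textual features) is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit token counts per class. docs is a list of token lists."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return {
        "class_counts": class_counts,
        "token_counts": token_counts,
        "vocab": vocab,
        "alpha": alpha,  # Laplace smoothing constant (assumed value)
        "n_docs": len(labels),
    }


def predict(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n in model["class_counts"].items():
        counts = model["token_counts"][label]
        total = sum(counts.values())
        score = math.log(n / model["n_docs"])  # log prior
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # ignore tokens never seen in training
            score += math.log((counts[tok] + alpha) / (total + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data: tokens could be words, #hashtags or @mentions.
docs = [
    ["melbourne", "tram", "#afl"],
    ["melbourne", "coffee"],
    ["london", "tube", "rain"],
    ["london", "#premierleague"],
]
labels = ["melbourne", "melbourne", "london", "london"]
model = train_multinomial_nb(docs, labels)
print(predict(model, ["tram", "coffee"]))
print(predict(model, ["tube"]))
```

Treating each candidate city as a class keeps the model simple, and the smoothing term prevents a single unseen token from driving a class probability to zero.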
Proceedings of the 27th ACM Conference on Hypertext and Social Media (HT'16), 2016
Authority users often play important roles in a social system. They are expected to write good reviews at product review sites; provide high quality answers in question answering systems; and share interesting content in social networks. In the context of marketing and advertising, knowing how users react to emails and messages from authority senders is important, given the prevalence of email in our everyday life. Using a real-life academic event, we designed and conducted an online controlled experiment to determine how email senders of different types of authority (department head, event organizer and a general email account) affect the range of response behavior of recipients, which includes opening the email, browsing the event website, and registering for the event. In addition, we proposed a systematic approach to analyze the user response behavior to email campaigns from the time the user receives the email till he/she browses the website in a seamless manner.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM'16), 2016
There has been a growing interest in recommending trips for tourists using location-based social networks. The challenge of trip recommendation not only lies in searching for relevant points-of-interest (POIs) to form a personalized trip, but also selecting the best time of day to visit the POIs. Popular POIs can be too crowded during peak times, resulting in long queues and delays. In this work, we propose the Personalized Crowd-aware Trip Recommendation (PersCT) algorithm to recommend personalized trips that also avoid the most crowded times of the POIs. We model the problem as an extension of the Orienteering Problem with multiple constraints. We extract user interests by collaborative filtering and we propose an extension of the Ant Colony Optimisation algorithm to merge user interests with POI popularity and crowdedness data to recommend trips. We evaluate our algorithm using foot traffic information obtained from a real-life pedestrian sensor dataset and user travel histories extracted from a Flickr photo dataset. We show that our algorithm outperforms several benchmarks in achieving a balance between conflicting objectives by satisfying user interests while reducing the crowdedness of the trips.
Extended Proceedings of the 27th ACM Conference on Hypertext and Social Media (HT'16), 2016
Touring is a popular but time-consuming activity, due to the need to identify interesting attractions or Places-of-Interest (POIs) and structure these POIs in the form of a time-constrained tour itinerary. To solve this challenge, we propose the Personalized Tour Recommendation and Planning (PersTour) system. The PersTour system is able to plan for a customized tour itinerary where the recommended POIs and visit durations are personalized based on the tourist's interest preferences. In addition, tourists have the option to indicate their trip constraints (e.g., a preferred starting/ending location and a specific tour duration) to further customize their tour itinerary.
Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS'16), 2016
Recommending and planning tour itineraries are challenging and time-consuming for tourists, hence they may seek tour operators for help. Traditionally, tour operators have offered standard tour packages of popular locations, but these packages may not cater to tourists' interests. In addition, tourists may want to travel in a group, e.g., extended family, and want an operator to help them. We introduce the novel problem of group tour recommendation (GROUPTOURREC), which involves many challenges: forming tour groups whose members have similar interests; recommending Points-of-Interest (POI) that form the tour itinerary and cater for the group's interests; and assigning guides to lead these tours. For each challenge, we propose solutions involving: clustering for tourist groupings; optimizing a variant of the Orienteering problem for POI recommendations; and integer programming for tour guide assignments. Using a Flickr dataset of seven cities, we compare our proposed approaches against various baselines and observe significant improvements in terms of interest similarity, total/maximum/minimum tour interests and total tour guide expertise.
The immense popularity and rapid growth of Online Social Networks (OSN) have attracted the interest of researchers and companies, particularly in how users group together to form communities online. While many community detection algorithms have been developed to detect communities on such OSNs, most of these algorithms are based only on topological links and researchers have observed that many topological links do not translate to actual user interaction. As such, many members of the detected communities do not communicate frequently with each other. This inactivity creates a problem in targeted advertising and viral marketing, which require the community to be highly active so as to facilitate the diffusion of product/service information. We propose an approach to detect highly interactive Twitter communities that share common interests, based on the frequency and patterns of direct tweeting among users, rather than the topological information implicit in follower/following links. Our experimental results show that communities detected by our proposed approach are more cohesive and connected within different interest groups, based on topological measures. We also show that the detected communities actively interact about the specific interests, based on the high frequency of #hashtags and @mentions related to this interest. In addition, we study the trends in their tweeting patterns such as how they follow and unfollow other users, and observe that our approach detects communities comprising users whose links are more persistent compared to those in other groups of users.
Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI'15), Jul 2015
Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need to select Points of Interest (POI) to visit in unfamiliar cities, and to select POIs that align with their interest preferences and trip constraints. We propose an algorithm called PersTour for recommending personalized tours using POI popularity and user interest preferences, which are automatically derived from real-life travel sequences based on geo-tagged photos. Our tour recommendation problem is modelled using a formulation of the Orienteering problem, and considers user trip constraints such as time limits and the need to start and end at specific POIs. In our work, we also reflect levels of user interest based on visit durations, and demonstrate how POI visit duration can be personalized using this time-based user interest. Using a Flickr dataset of four cities, our experiments show the effectiveness of PersTour against various baselines, in terms of tour popularity, interest, recall, precision and F1-score. In particular, our results show the merits of using time-based user interest and personalized POI visit durations, compared to the current practice of using frequency-based user interest and average visit durations.
Proceedings of the 2015 SIGMOD PhD Symposium (SIGMOD'15), May 2015
Photo sharing sites like Flickr and Instagram have grown increasingly popular in recent years, resulting in a large number of uploaded photos. In addition, these photos contain useful meta-data such as the time taken and geo-location. Using such geo-tagged photos and Wikipedia, we propose an approach for recommending tours based on a user's interests, derived from his/her visit history. We evaluate our proposed approach on a Flickr dataset comprising three cities and find that our approach is able to recommend tours that are more popular and comprise more places/points-of-interest, compared to various baselines. More importantly, we find that our recommended tours reflect the ground truth of real-life tours taken by users, based on measures of recall, precision and F1-score.
Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Green spaces are believed to improve the well-being of users in urban areas. While there are urba... more Green spaces are believed to improve the well-being of users in urban areas. While there are urban research exploring the emotional benefits of green spaces, these works are based on user surveys and case studies, which are typically small in scale, intrusive, time-intensive and costly. In contrast to earlier works, we utilize a non-intrusive methodology to understand green space effects at large-scale and in greater detail, via digital traces left by Twitter users. Using this methodology, we perform an empirical study on the effects of green spaces on user sentiments and emotions in Melbourne, Australia and our main findings are: (i) tweets in green spaces evoke more positive and less negative emotions, compared to those in urban areas; (ii) each season affects various emotion types differently; (iii) there are interesting changes in sentiments based on the hour, day and month that a tweet was posted; and (iv) negative sentiments are typically associated with large transport infrastructures such as train interchanges, major road junctions and railway tracks. The novelty of our study is the combination of psychological theory, alongside data collection and analysis techniques on a large-scale Twitter dataset, which overcomes the limitations of traditional methods in urban research.
Proceedings of the 2018 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML-PKDD'18), 2018
Twitter is a popular social networking site that generates a large volume and variety of tweets, ... more Twitter is a popular social networking site that generates a large volume and variety of tweets, thus a key challenge is to filter and track relevant tweets and identify the main topics discussed in real-time. For this purpose, we developed the Real-time Analytics Platform for Interactive Data mining (RAPID) system, which provides an effective data collection mechanism through query expansion, numerous analysis and visualization capabilities for understanding user interactions, tweeting behaviours, discussion topics, and other social patterns.
Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need ... more Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need to select Points of Interest (POI) to visit in unfamiliar cities, and to select POIs that align with their interest preferences and trip constraints. We propose an algorithm called PersTour for recommending personalized tours using POI popularity and user interest preferences, which are automatically derived from real-life travel sequences based on geo-tagged photos. Our tour recommendation problem is modelled using a formulation of the Orienteering problem, and considers user trip constraints such as time limits and the need to start and end at specific POIs. In our work, we also reflect levels of user interest based on visit durations, and demonstrate how POI visit duration can be personalized using this time-based user interest. Furthermore , we demonstrate how PersTour can be further enhanced by: (i) a weighted updating of user interests based on the recency of their POI visits; and (ii) an automatic weighting between POI popularity and user interests based on the tourist's activity level. Using a Flickr dataset of ten cities, our experiments show the effectiveness of PersTour against various collaborative filtering and greedy-based baselines, in terms of tour popularity, interest, recall, precision and F1-score. In particular, our results show the merits of using time-based user interest and personalized POI visit durations, compared to the current practice of using frequency-based user interest and average visit durations.
Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Studying large, widely spread Twitter data has laid the foundation for many novel applications fr... more Studying large, widely spread Twitter data has laid the foundation for many novel applications from predicting natural disasters and epidemics to understanding urban dynamics. Recent studies have fo-cused on exploring people's emotional response to their urban environment , e.g., green spaces versus built up areas, through analysing the sentiment of tweets within that area. Since green spaces have the capacity to improve citizen's well-being, we developed a system that is capable of recommending green spaces to users. Our system is unique in the sense that the recommendations are tailored with regard to users' preferred activity as well as the degree of positive sentiments in each green space. We show that the incoming flow of tweets can be used to refine the recommendations over time. Furthermore, We implemented a web-based, user-friendly interface to solicit user inputs and display recommendation results.
Proceedings of the 7th International Workshop on New Frontiers in Mining Complex Patterns (NFMCP'18), 2018
Twitter is increasingly used for political, advertising and marketing campaigns, where the main aim is to influence users to support specific causes, individuals or groups. We propose a novel methodology for mining and analyzing Twitter campaigns, which includes: (i) collecting tweets and detecting topics relating to a campaign; (ii) mining important campaign topics using scientometric measures; (iii) modelling user interests using hashtags and topical entropy; (iv) identifying influential users using an adapted PageRank score; and (v) identifying bot-like activities using various metrics and visualization techniques. While this methodology is generalizable to multiple campaign types, we demonstrate its effectiveness on the 2017 German federal election.
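As an illustration of step (iii), the following is a minimal sketch of a topical entropy score. It assumes that topical entropy is the Shannon entropy (in bits) of a user's hashtag distribution, which may differ from the exact definition used in the paper; the function name and sample hashtags are hypothetical.

```python
import math
from collections import Counter


def topical_entropy(hashtags):
    """Shannon entropy (bits) of a user's hashtag usage distribution.

    Low values suggest a user focused on few topics; high values suggest
    interests spread across many topics.
    """
    counts = Counter(tag.lower() for tag in hashtags)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical users: one focused on a single hashtag, one spread over four.
focused = topical_entropy(["#btw17", "#BTW17", "#btw17"])
diverse = topical_entropy(["#btw17", "#klima", "#euro", "#bildung"])
```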
Proceedings of the 2018 Web Conference Companion (WWW'18), 2018
Travelling and touring are popular leisure activities enjoyed by millions of tourists around the world. However, the task of travel itinerary recommendation and planning is tedious and challenging for tourists, who are often unfamiliar with the various Points-of-Interest (POIs) in a city. Apart from identifying popular POIs, the tourist needs to construct a travel itinerary comprising a subset of these POIs, and to order these POIs as a sequence of visits that can be completed within his/her available touring time. For a more realistic itinerary, the tourist also has to account for travelling time between POIs and visiting times at individual POIs. Furthermore, this itinerary should incorporate tourist preferences such as desired starting and ending POIs (e.g., POIs that are near the tourist's hotel) and a subset of must-see POIs (e.g., popular POIs that a tourist must visit). We term this the TourMustSee problem, which is based on a variant of the Orienteering problem. We then propose the LP+M algorithm for solving the TourMustSee problem as an Integer Linear Program (ILP). Using a Flickr dataset of POI visits in seven touristic cities, we compare LP+M against various ILP-based baselines, and the results show that LP+M recommends better travel itineraries in terms of POI popularity, total POIs visited, total touring time utilized and must-visit POI(s) inclusion.
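The constraints described above (fixed start and end POIs, a time budget covering travel and visits, and must-see POIs) can be made concrete with a small sketch. This is not the LP+M algorithm or an ILP formulation: it swaps in exhaustive enumeration, which is only practical for a handful of POIs, and it assumes distinct start and end POIs. All names and data are hypothetical.

```python
from itertools import permutations


def best_itinerary(pois, travel, start, end, must_see, budget):
    """Exhaustively search for the most popular feasible itinerary.

    pois:   {name: (popularity, visit_minutes)}
    travel: {(a, b): minutes} for every ordered pair of distinct POIs
    Assumes start != end. Returns (route, total_popularity), or
    (None, -1.0) if no itinerary satisfies the constraints.
    """
    middle = [p for p in pois if p not in (start, end)]
    best_score, best_route = -1.0, None
    for k in range(len(middle) + 1):
        for order in permutations(middle, k):
            route = (start, *order, end)
            # Every must-see POI has to appear somewhere on the route.
            if not set(must_see) <= set(route):
                continue
            minutes = sum(pois[p][1] for p in route)
            minutes += sum(travel[(a, b)] for a, b in zip(route, route[1:]))
            if minutes > budget:
                continue
            score = sum(pois[p][0] for p in route)
            if score > best_score:
                best_score, best_route = score, route
    return best_route, best_score


# Hypothetical city: popularity and visit time per POI, 10 minutes between any pair.
pois = {
    "hotel": (0, 0),
    "museum": (5, 60),
    "tower": (8, 90),
    "market": (3, 30),
    "station": (0, 0),
}
travel = {(a, b): 10 for a in pois for b in pois if a != b}
route, score = best_itinerary(
    pois, travel, start="hotel", end="station", must_see={"market"}, budget=150
)
```

Enumeration grows factorially with the number of POIs, which is one reason an ILP solver is the more realistic choice for city-scale instances.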
Tourism is a popular leisure activity and an important industry, where the main task involves visiting unfamiliar Places-of-Interest (POI) in foreign cities. Recommending POIs and tour planning are challenging and time-consuming tasks for tourists due to: (i) the need to identify and recommend captivating POIs in an unfamiliar city; (ii) having to schedule POI visits as a connected itinerary that satisfies trip constraints such as starting/ending near a specific location (e.g., the tourist's hotel) and completing the itinerary within a limited touring duration; and (iii) having to satisfy the diverse interest preferences of each unique tourist. While tourism-related information can be obtained from the Internet, travel guides and tour agencies, many of these resources simply recommend individual POIs or popular itineraries, but otherwise do not appeal to the interest preferences of users or adhere to their trip constraints.
In contrast to existing works on next-POI prediction and top-k POI recommendation that recommend a single POI or a ranked list of POIs, the task of tour recommendation involves the need to identify a set of interesting POIs and schedule them as an itinerary with various time and space constraints. While there are works on path planning that recommend an itinerary, this itinerary is typically optimized based on a global utility such as POI popularity, and thus offers no personalization for a tourist based on his/her interest preferences.
This thesis addresses the challenges associated with the automation and personalization of tour recommendation using data mining techniques to model user interest and POI-related information, and using optimization problems and techniques to formulate and solve more realistic tour recommendation problems. Our main contributions include:
1.) Proposing and implementing a framework that utilizes Flickr geo-tagged photos and Wikipedia to automatically determine user trajectories, interest preferences and POI-related information such as POI popularity and visiting times.
2.) Proposing the PersTour algorithm for recommending personalized tour itineraries based on POI popularity, users' interest preferences and trip constraints, where POI visit durations are customized based on user interests.
3.) Formulating the QueueTourRec problem for recommending queue-aware and personalized itineraries that schedule visits to popular and interesting POIs at times with minimal queuing times, and proposing a novel implementation of Monte Carlo Tree Search to solve this problem.
4.) Developing the TourRecInt algorithm for tour recommendation based on a variant of the Orienteering problem with a mandatory POI category, which is defined as the POI category that a tourist has most frequently visited.
5.) Formulating and solving the novel GroupTourRec problem, which involves recommending tour itineraries to groups of tourists with diverse interests and assigning tour guides with the right expertise to lead each tour group.
6.) Illustrating the application of our proposed approach in practice, by presenting a web-based system implementation of our PersTour algorithm, with the front-end component developed using HTML, PHP, jQuery and the Google Maps API, and the back-end based on Python, Java and PHP.
Proceedings of the 2017 IEEE International Conference on Big Data (BigData'17), 2017
Twitter is a popular microblogging service, where users frequently engage in discussions about various topics of interest, ranging from popular topics (e.g., music) to niche topics (e.g., politics). Given the large volume of tweets, a key challenge is to automatically model and determine the discussion topics without having prior knowledge of the types and number of topics, or requiring the technical expertise to define various algorithmic parameters. For this purpose, we propose the Clustering-based Topic Modelling (ClusTop) algorithm that constructs various types of word networks and automatically determines the discussion topics using community detection approaches. Unlike traditional topic models, ClusTop is able to automatically determine the appropriate number of topics and does not require numerous parameters to be set. The ClusTop algorithm is also able to capture the syntactic meaning in tweets via the use of bigrams, trigrams and other word combinations in constructing the word network graph. Using three Twitter datasets with labelled crises and events as topics, ClusTop has been shown to outperform various baselines in terms of topic coherence, pointwise mutual information, precision, recall and F-score.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'17), 2017
Personalized itinerary recommendation is a complex and time-consuming problem, due to the need to recommend popular attractions that are aligned to the interest preferences of a tourist, and to plan these attraction visits as an itinerary that has to be completed within a specific time limit. Furthermore, many existing itinerary recommendation systems do not automatically determine and consider queuing times at attractions in the recommended itinerary, which vary based on the time of visit to the attraction, e.g., longer queuing times at peak hours. To solve these challenges, we propose the PersQ algorithm for recommending personalized itineraries that take into consideration attraction popularity, user interests and queuing times. We also implement a framework that utilizes geo-tagged photos to derive attraction popularity, user interests and queuing times, which PersQ uses to recommend personalized and queue-aware itineraries. We demonstrate the effectiveness of PersQ in the context of five major theme parks, based on a Flickr dataset spanning nine years. Experimental results show that PersQ outperforms various state-of-the-art baselines, in terms of various queuing-time-related metrics, itinerary popularity, user interest alignment, recall, precision and F1-score.
Proceedings of the 2017 IEEE International Conference on Big Data (BigData'17), 2017
Topic modelling is a well-studied field that aims to identify topics from traditional documents such as news articles and reports. More recently, Latent Dirichlet Allocation (LDA) and its variants have been applied on social media platforms to model and study topics relating to sports, politics and companies. While these applications were able to successfully identify the general topics, we posit that standard LDA can be augmented with spatial and temporal considerations based on the geo-coordinates and timestamps of social media posts. Towards this effort, we propose a spatial and temporal variant of LDA to better detect more specific topics, such as a particular art exhibit held at a museum or a security incident happening on a particular day. We validate our approach on a Twitter dataset and find that the detected topics are well-aligned to real-life events happening on the specific days and locations.
Extended Proceedings of the 24th Conference on User Modeling, Adaptation and Personalization (UMAP'16), Doctoral Consortium, 2016
Travel itinerary recommendation is an important but challenging problem, due to the need to recommend captivating Places-of-Interest (POI) and construct these POIs as a connected itinerary. Another challenge is to personalize these recommended itineraries based on tourist interests and their preferences for starting/ending POIs and time/distance budgets. Our work aims to address these challenges by proposing algorithms to recommend personalized travel itineraries for both individuals and groups of tourists, based on their interest preferences. To determine these interests, we first construct tourists' past POI visits based on their geo-tagged photos and then build a model of user interests based on their time spent visiting each POI. Experimental evaluation on a Flickr dataset of multiple cities shows that our proposed algorithms outperform various baselines in terms of recall, precision, F1-score and other heuristics-based metrics.
Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS'16), Doctoral Consortium, 2016
Trip planning is both challenging and tedious for tourists due to their unique interest preferences and various trip constraints. Despite the availability of online resources for tour planning and services provided by tour agencies, there are various challenges such as: (i) selecting POIs that are personalized to the unique interests of individual travellers; (ii) constructing these POIs as an itinerary, with considerations for time availability and starting/ending place preferences (e.g., near a tourist's hotel); (iii) for tour agencies to group tourists into tour groups such that the recommended tour appeals to the interests of the group as a whole; and (iv) similarly, for tour agencies to assign tour guides with the right expertise to lead each of these tour groups. In our work, we aim to develop algorithms for recommending personalized tours to both individual travellers and groups of tourists, based on their interest preferences, which we automatically determine based on geo-tagged photos posted by these tourists. Using a Flickr dataset of geo-tagged photos as ground-truth for real-life POI visits in multiple cities, we evaluate our proposed algorithms using various metrics such as precision, recall, F1-score, user interest scores and POI popularity, among others.
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT'16), 2016
Knowing the location of a social media user and their posts is important for various purposes, such as the recommendation of location-based items/services, and locality detection of crises/disasters. This paper describes our submission to the shared task "Geolocation Prediction in Twitter" of the 2nd Workshop on Noisy User-generated Text. In this shared task, we propose an algorithm to predict the location of Twitter users and tweets using a multinomial Naive Bayes classifier trained on Location Indicative Words and various textual features (such as city/country names, #hashtags and @mentions). We compared our approach against various baselines based on Location Indicative Words, city/country names, #hashtags and @mentions as individual feature sets, and experimental results show that our approach outperforms these baselines in terms of classification accuracy, mean and median error distance.
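To show how a multinomial Naive Bayes classifier maps tokens to a location label, here is a minimal pure-Python sketch with Laplace smoothing. It is not the submitted system: the feature extraction (Location Indicative Words, city/country names and so on) is reduced to pre-tokenized lists, and the class name, smoothing value and toy tweets are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over token lists with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_logp = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            # Log prior plus the sum of log likelihoods of known tokens.
            logp = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens never seen in training
                    logp += math.log((counts[tok] + self.alpha) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical training tweets, already tokenized, labelled by city.
train_docs = [
    ["melbourne", "tram", "#afl"],
    ["footy", "melbourne", "@yarratrams"],
    ["london", "tube", "#brexit"],
    ["thames", "london", "pub"],
]
train_labels = ["melbourne-au", "melbourne-au", "london-gb", "london-gb"]
model = TinyMultinomialNB().fit(train_docs, train_labels)
predicted = model.predict(["tram", "melbourne"])
```

Working in log space avoids numerical underflow when many token probabilities are multiplied together.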
Proceedings of the 27th ACM Conference on Hypertext and Social Media (HT'16), 2016
Authority users often play important roles in a social system. They are expected to write good reviews at product review sites; provide high quality answers in question answering systems; and share interesting content in social networks. In the context of marketing and advertising, knowing how users react to emails and messages from authority senders is important, given the prevalence of email in our everyday life. Using a real-life academic event, we designed and conducted an online controlled experiment to determine how email senders of different types of authority (department head, event organizer and a general email account) affect the range of response behavior of recipients, which includes opening the email, browsing the event website, and registering for the event. In addition, we proposed a systematic approach to analyze the user response behavior to email campaigns from the time the user receives the email till he/she browses the website in a seamless manner.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management (CIKM'16), 2016
There has been a growing interest in recommending trips for tourists using location-based social networks. The challenge of trip recommendation not only lies in searching for relevant points-of-interest (POIs) to form a personalized trip, but also selecting the best time of day to visit the POIs. Popular POIs can be too crowded during peak times, resulting in long queues and delays. In this work, we propose the Personalized Crowd-aware Trip Recommendation (PersCT) algorithm to recommend personalized trips that also avoid the most crowded times of the POIs. We model the problem as an extension of the Orienteering Problem with multiple constraints. We extract user interests by collaborative filtering and we propose an extension of the Ant Colony Optimisation algorithm to merge user interests with POI popularity and crowdedness data to recommend trips. We evaluate our algorithm using foot traffic information obtained from a real-life pedestrian sensor dataset and user travel histories extracted from a Flickr photo dataset. We show that our algorithm outperforms several benchmarks in achieving a balance between conflicting objectives by satisfying user interests while reducing the crowdedness of the trips.
Extended Proceedings of the 27th ACM Conference on Hypertext and Social Media (HT'16), 2016
Touring is a popular but time-consuming activity, due to the need to identify interesting attractions or Places-of-Interest (POIs) and structure these POIs in the form of a time-constrained tour itinerary. To solve this challenge, we propose the Personalized Tour Recommendation and Planning (PersTour) system. The PersTour system is able to plan for a customized tour itinerary where the recommended POIs and visit durations are personalized based on the tourist's interest preferences. In addition, tourists have the option to indicate their trip constraints (e.g., a preferred starting/ending location and a specific tour duration) to further customize their tour itinerary.
Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS'16), 2016
Recommending and planning tour itineraries are challenging and time-consuming for tourists, hence they may seek tour operators for help. Traditionally tour operators have offered standard tour packages of popular locations, but these packages may not cater to tourists' interests. In addition, tourists may want to travel in a group, e.g., extended family, and want an operator to help them. We introduce the novel problem of group tour recommendation (GROUPTOURREC), which involves many challenges: forming tour groups whose members have similar interests; recommending Points-of-Interest (POIs) that form the tour itinerary and cater for the group's interests; and assigning guides to lead these tours. For each challenge, we propose solutions involving: clustering for tourist groupings; optimizing a variant of the Orienteering problem for POI recommendations; and integer programming for tour guide assignments. Using a Flickr dataset of seven cities, we compare our proposed approaches against various baselines and observe significant improvements in terms of interest similarity, total/maximum/minimum tour interests and total tour guide expertise.
The immense popularity and rapid growth of Online Social Networks (OSN) have attracted the interest of researchers and companies, particularly in how users group together to form communities online. While many community detection algorithms have been developed to detect communities on such OSNs, most of these algorithms are based only on topological links and researchers have observed that many topological links do not translate to actual user interaction. As such, many members of the detected communities do not communicate frequently with each other. This inactivity creates a problem in targeted advertising and viral marketing, which require the community to be highly active so as to facilitate the diffusion of product/service information. We propose an approach to detect highly interactive Twitter communities that share common interests, based on the frequency and patterns of direct tweeting among users, rather than the topological information implicit in follower/following links. Our experimental results show that communities detected by our proposed approach are more cohesive and connected within different interest groups, based on topological measures. We also show that the detected communities actively interact about the specific interests, based on the high frequency of #hashtags and @mentions related to this interest. In addition, we study the trends in their tweeting patterns such as how they follow and unfollow other users, and observe that our approach detects communities comprising users whose links are more persistent compared to those in other groups of users.
Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI'15), Jul 2015
Tour recommendation and itinerary planning are challenging tasks for tourists, due to their need to select Points of Interest (POI) to visit in unfamiliar cities, and to select POIs that align with their interest preferences and trip constraints. We propose an algorithm called PersTour for recommending personalized tours using POI popularity and user interest preferences, which are automatically derived from real-life travel sequences based on geo-tagged photos. Our tour recommendation problem is modelled using a formulation of the Orienteering problem, and considers user trip constraints such as time limits and the need to start and end at specific POIs. In our work, we also reflect levels of user interest based on visit durations, and demonstrate how POI visit duration can be personalized using this time-based user interest. Using a Flickr dataset of four cities, our experiments show the effectiveness of PersTour against various baselines, in terms of tour popularity, interest, recall, precision and F1-score. In particular, our results show the merits of using time-based user interest and personalized POI visit durations, compared to the current practice of using frequency-based user interest and average visit durations.
Proceedings of the 2015 SIGMOD PhD Symposium (SIGMOD'15), May 2015
Photo sharing sites like Flickr and Instagram have grown increasingly popular in recent years, resulting in a large amount of uploaded photos. In addition, these photos contain useful meta-data such as the time taken and geo-location. Using such geo-tagged photos and Wikipedia, we propose an approach for recommending tours based on user interests from his/her visit history. We evaluate our proposed approach on a Flickr dataset comprising three cities and find that our approach is able to recommend tours that are more popular and comprise more places/points-of-interest, compared to various baselines. More importantly, we find that our recommended tours reflect the ground truth of real-life tours taken by users, based on measures of recall, precision and F1-score.
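The recall, precision and F1-score measures mentioned above can be sketched as set overlap between a recommended tour and the real-life tour it is compared against. This assumes POIs are compared as unordered sets, which may be simpler than the evaluation actually used; the function name and POI names are hypothetical.

```python
def tour_scores(recommended, actual):
    """Precision, recall and F1 of a recommended tour against a real tour.

    Both arguments are iterables of POI identifiers; visit order is ignored.
    """
    rec, real = set(recommended), set(actual)
    hits = len(rec & real)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(real) if real else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical tours: two of the four recommended POIs were actually visited.
precision, recall, f1 = tour_scores(
    ["museum", "tower", "market", "zoo"], ["museum", "tower", "harbour"]
)
```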