Abstract
In this paper, we present Indomilando, a Cultural Heritage Game with a Purpose (GWAP) aimed at ranking the photos of the architectural assets of the city of Milan according to their recognizability. Besides evaluating the ability of Indomilando to achieve its ranking purpose, we also analyze the effect of an educational incentive on the players’ engagement. Indeed, discovering new cultural assets appeared to be a valuable reason to continue playing.
1 Introduction
Cultural heritage collections are maintained by public and private institutions and described in digital catalogues, often constituted by Web-based anthologies. The advent of digital photography has led to a significant increase in the wealth of images depicting cultural heritage assets – and in their Web availability. However, given a large image collection of cultural heritage assets, the issue arises of how to analyze and process the set of photos without resorting to expensive manual work. Human Computation [1] has emerged as a successful paradigm to face this challenge in a cheaper way. Therefore, the following questions emerge regarding the adoption of a Human Computation approach: (1) given a set of images of a cultural heritage asset, is it possible to identify the most representative or the most recognizable photo? (2) given a collection of images of different cultural heritage assets, is it possible to tell apart the most popular assets and those that could benefit from promotion campaigns?
This paper contributes the design, implementation and evaluation of Indomilando, an application aimed at ranking a quite heterogeneous set of images depicting the cultural heritage assets of Milan. Indomilando is designed as a Web-based Game with a Purpose that involves players in guessing an asset from a set of photos. We evaluate the ability of Indomilando to achieve its ranking purpose and to engage users. Moreover, given the cultural flavour of our game, we also investigate whether Indomilando achieves the “collateral effect” of giving new knowledge back to users.
2 Related Work
Human Computation [1] and Crowdsourcing [2] are different approaches to involve people in mixed human-machine computational systems to collect and process information. In the cultural heritage field, there are studies and surveys [3, 4] that explore how to apply crowdsourcing-like methods to achieve different tasks [5]. Games with a Purpose (GWAP [6]) emerged as an interesting Human Computation method to collect information from game players that are often unaware that their playful activities hide the achievement of a task. Among possible purposes, ranking items in a collection is explored in some GWAP cases [7–9]. Several GWAPs in literature are based on multimedia elements [7, 10]; among those, some applications are specifically related to the cultural heritage domain [11]. In this paper, we describe a GWAP aimed to rank images of cultural heritage assets.
The main challenge in those Human Computation applications is modeling and predicting user engagement [12, 13]. Therefore it is important to design suitable incentive schemes to foster user participation [14]; it is demonstrated that gamification and furtherance incentives improve both quality and quantity of task execution [15]. One possible incentive can be represented by an educational stimulus [16]: users are encouraged to participate not only because they have fun and/or are paid, but also because they acquire new interesting knowledge. This type of incentive is particularly interesting in our cultural heritage case. However, balancing the purpose of a GWAP with educational-style incentives seems to be far from trivial: either an expensive training effort is needed to ensure result quality [17] or the learning stimulus negatively impacts the purpose quality [18]. In our work, we aim to analyze and evaluate the interplay of purpose achievement, user engagement and educational-like incentives in a cultural heritage GWAP.
3 Design and Development of the Indomilando Game
We developed Indomilando (cf. http://bit.ly/indomilando), a GWAP [6] that engages players in a game to leverage their human contributions to rank cultural heritage photos. Indomilando makes use of Lombardy region’s SIRBeC data related to the architectural assets of the city of Milan: 2,104 photos depicting 685 cultural heritage assets of Milan. The purpose of Indomilando is to rank the assets’ photos according to their popularity; by popularity we mean both how famous an asset is (with respect to the other assets) and how recognizable or representative a photo is (with respect to the other photos of the same asset). To achieve this purpose, we show some photos of different assets and ask the players to guess which one corresponds to a given asset.
Indomilando can be played in rounds composed of levels; in each level, the player is presented with the name of an architectural asset and 4 photos. The game goal is to identify the correct photo for the given asset; for every right choice, the player score increases, and it increases more for consecutive right answers. Gaining points in levels and rounds lets the player climb the leaderboard and obtain badges. Every time a user completes a level by making a choice, Indomilando highlights the correct/incorrect answer and shows the 4 names of the assets depicted in the photos. Moreover, the game lets the user learn more about the assets: each photo is associated with a link to the corresponding SIRBeC catalogue report. At the end of the round, the player can also explore a map of Milan showing the location of all the assets the user played with. Figure 1 shows some screenshots of Indomilando gameplay.
Indomilando was published online in mid October 2015 and promoted via social media campaigns. A link to the game was recently added to several culture-related official portals of Lombardy Region. In the following sections, we present our analysis and evaluation of Indomilando based on the data collected in the last three months. Further details and graphics regarding this evaluation are available at http://swa.cefriel.it/urbangames/indomilando/icwe2016.html.
4 Game Purpose Analysis
Since Indomilando is a GWAP [6], we first analyze its ability to achieve its goal: ranking the full set of photos and ranking the depicted assets.
4.1 Photo Ranking
Whenever an asset photo is presented to a player to be guessed, the user can either correctly choose it or wrongly select one of the distracting images. To create a numeric score to rank the photos, instead of relying on the ratio between the number of times the photo was correctly chosen and the number of times the photo was visualized to be guessed (in the following simply named chosen/visualized ratio), we assign a score to the photos by using the Wilson score confidence interval [19] for a Bernoulli parameter; in our case, a trial is the visualization of a photo to be guessed, and its possible outcomes are that the photo is correctly chosen (success) or wrongly discarded (failure). We choose the lower bound of the Wilson interval as a conservative measure of the score w. We compute the w score for each asset photo.
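The lower bound of the Wilson interval can be computed directly from the success and trial counts. The following is a minimal Python sketch; the function name and the 95 % confidence level are our assumptions, as the paper does not specify its implementation:

```python
import math

def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score confidence interval for a Bernoulli
    proportion; z = 1.96 corresponds to 95 % confidence."""
    if trials == 0:
        return 0.0
    p = successes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials ** 2))
    return (centre - margin) / denom

# A photo correctly guessed 8 times out of 10 visualizations
w = wilson_lower_bound(8, 10)  # roughly 0.49
```

Note that the lower bound is deliberately conservative: with few trials it stays well below the raw chosen/visualized ratio, so photos seen only a handful of times cannot outrank frequently confirmed ones.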
Given that a player has to identify the correct image in a set of four photos, the success (or failure) depends not only on the “correct” picture, but also on the three distracting ones. The distracting options are selected among the assets belonging to the same category as the one to be guessed, but some categories are more recognizable than others. We correct the w score to take this effect into account, by applying a standardization by asset type, as follows: \(\widetilde{w} = (w - \bar{X}_t)/{S_t}\), where t indicates the photo asset type, and \(\bar{X}_t\) and \(S_t\) are, respectively, the sample mean and the sample standard deviation of the w scores of the same type t. We adopt \(s_{photo}=\widetilde{w}\) as the metric to rank Indomilando asset photos.
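The per-type standardization can be sketched as follows; the data structures (dictionaries keyed by a hypothetical photo id) are illustrative and not taken from the paper:

```python
from collections import defaultdict
from statistics import mean, stdev

def standardize_by_type(w_scores: dict, types: dict) -> dict:
    """Compute s_photo = (w - mean_t) / stdev_t for each photo, where the
    mean and sample standard deviation are taken over the photo's asset type.
    Assumes at least two photos per type (stdev needs >= 2 data points)."""
    by_type = defaultdict(list)
    for pid, w in w_scores.items():
        by_type[types[pid]].append(w)
    stats = {t: (mean(ws), stdev(ws)) for t, ws in by_type.items()}
    return {pid: (w - stats[types[pid]][0]) / stats[types[pid]][1]
            for pid, w in w_scores.items()}
```

After standardization the scores of different asset types live on a common scale, so a hard-to-recognize church and an easy-to-recognize villa can be compared fairly.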
4.2 Asset Ranking
The score of an asset could be defined as \(\bar{s}_{photo}\), the average score of all the photos depicting that asset, but a simple mean is sensitive to the characteristics of the set of images. Experimentally, we observe that the mean score of the asset’s photos \(\bar{s}_{photo}\) increases with the number \(N_a\) of photos in the set. This result could be caused by the “learning” effect that a high number of photos has on the players: a user over time could learn to identify an asset he/she had to recognize or discard in multiple previous game rounds. We introduce an adjustment for the number of photos derived from the linear regression line, as follows: \(adj_{num} = \beta _0 + \beta _1 \cdot N_a\).
To take into account the effect of the inhomogeneity of the photo set, we consider the variance \(v_a\) of the chosen/visualized rate across the same photos belonging to the asset and we introduce the following adjustment: \(adj_{set} = (v_a - \bar{X})/S\), where the variance \(v_a\) is standardized with regards to the mean \(\bar{X}\) and the standard deviation S of v across all photos.
We finally define the asset score as: \(s_{asset} = \bar{s}_{photo} - adj_{num} - adj_{set}\).
4.3 Results
We collect the game log information and compute the \(s_{photo}\) score and the \(s_{asset}\) score, taking into consideration only the images that were played by at least three players. The results are shown in the following table.
| No. of players | Total effective played time | Completed photos | Completed assets | Throughput (tasks/hour) | ALP (time/player) |
|---|---|---|---|---|---|
| 72 | 8 h 58 m 20 s | 1,397 (66.4 %) | 524 (76.5 %) | 155.22 | 7 m 29 s |
The total effective time includes only the time needed to choose an image in each game level; thus, in around 9 h, 72 players were able to complete the ranking of 66.4 % of the photos and 76.5 % of the assets. The table also displays the main metrics for GWAP evaluation [6]: the average lifetime play (ALP) is 7.5 min and measures the time spent on average by each user playing the game – it is a measure of Indomilando’s engagement – while the throughput measures how many “tasks” are completed in the unit of time. These metrics can be used to estimate how much time and how many users are needed to complete the tasks at hand. In the Indomilando case, we can therefore estimate that the whole set of 2,104 photos and 685 assets could be ranked in around 13.5 h by less than 110 players.
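As a sanity check, the estimate follows directly from the figures in the table (a back-of-the-envelope Python sketch; the inputs are from the table above, the rounding is ours):

```python
# Back-of-the-envelope estimate from the GWAP metrics reported above
throughput = 155.22                 # ranked photos per hour
alp_hours = (7 * 60 + 29) / 3600    # average lifetime play: 7 m 29 s per player
total_photos = 2104

hours_needed = total_photos / throughput    # around 13.5 h of effective play
players_needed = hours_needed / alp_hours   # just under 110 players
```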
4.4 Ground Truth Evaluation
To evaluate the ability of Indomilando to achieve its purpose, we need to assess both the photo ranking and the asset ranking. From our manual inspection of the rankings, we can say that the photos depicting the same asset, as well as the assets belonging to the same type, are indeed ordered from the most to the least representative.
To have a more objective evaluation of our scores, we set up a term of comparison for our rankings. We search for the cultural heritage assets in the Italian version of Wikipedia, checking whether they have a dedicated page: among the 685 assets of Milan, we find 111 pages. Then, we derive a rank by sorting the set of assets by decreasing number of Wikipedia visits (cf. http://stats.grok.se/), which can be interpreted as a measure of the “fame” of each asset. We first evaluate Indomilando through the rank correlation between the asset score computed by the game and the Wikipedia visits. We compute both the Spearman \(\rho \) and Kendall \(\tau \) rank correlation indicators [20]. There is indeed a positive correlation (\(\rho =0.20\), \(\tau =0.15\)) that is stronger when focusing on the 10 most visited pages in Wikipedia (\(\rho =0.60\), \(\tau =0.556\)).
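For reference, the Kendall \(\tau \) indicator counts concordant versus discordant pairs of items across the two rankings. A minimal pure-Python version (the tau-a variant, which ignores ties, unlike library implementations that apply tie corrections) could look like:

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall rank correlation (tau-a, no tie correction) between two
    equally long lists of scores or ranks."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1    # the pair is ordered the same way in both lists
        elif s < 0:
            discordant += 1    # the pair is ordered oppositely
    n_pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / n_pairs

# Perfectly agreeing rankings give tau = 1, reversed rankings give tau = -1
tau = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])  # 1.0
```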
We build a second reference “ground truth” by manually ordering the ten most visited assets on Wikipedia: a set of 12 participants is asked to rank them by their recognizability. Then we aggregate the manual ranks using a weighted brute-force algorithm [21]; the weighting is applied both to users (incorporating their level of familiarity with Milan) and to individual ranks (accounting for the possible lack of knowledge about any asset). We again compute the rank correlation \(\rho \) and \(\tau \) indicators between the Indomilando asset rank and the ordering obtained by aggregating the manual tests. In this case, the correlation values are much higher (\(\rho =0.806\), \(\tau =0.60\)), indicating that Indomilando is able to approximate the reference ranking with a high level of accuracy.
5 Game Engagement Analysis
We propose two types of evaluation: an objective analysis of played time and a subjective survey of participants’ experience.
5.1 Analysis of Played Time
In the experimentation period, a total of 72 users played the Indomilando game, with an ALP of 7.5 min. Figure 2 shows the distribution of the total effective played time, i.e. the time actually spent trying to guess the right answer, from the beginning to the end of a level. This right-skewed shape is recurring in casual games’ engagement measures. Although the curve decreases very quickly, we can distinguish a group of players characterized by a different behaviour, corresponding to the little rise between 10 and 20 min.
We therefore suppose the existence of two sets of players: the first and larger one includes the users with a total played time of around 2.5 min, while a less numerous set includes the users who played approximately 7–8 times longer. The empirical distribution of the total number of levels played per user displays a behaviour very similar to the played-time distribution and confirms our analysis, also regarding the existence of different groups of players.
5.2 Subjective Analysis of Engagement
We also conduct a second type of engagement assessment, by setting up an online evaluation questionnaire and asking the Indomilando players to fill it in. Some of the questions were explicitly aimed at understanding the game’s reception (cf. Fig. 3). The results show that Indomilando is indeed perceived as a fun game with a simple and intuitive gameplay. We can conclude that Indomilando has a good engagement capability.
6 Game Cultural Incentive Analysis
Given the cultural topic of the game, we hypothesize that an incentive to play is constituted by the interest in learning something new about Milan.
6.1 Analysis of “Learning” Time
During the gameplay, the players can learn more about the assets they are playing with: at the end of each level, when the names of the 4 depicted assets are displayed, and at the end of the round, when the assets are visualized on a map. We analyze the time spent between levels of the same round and the time spent between consecutive rounds; we can consider those intervals as the actual learning time. Figure 4 shows the distribution of the time between levels (summed over the round); the graph is cut at 2 min, but the between-levels time goes up to almost 7 min. The curve decreases very quickly – in most cases only 2–3 s are spent between levels – but we can notice that sometimes users spent a considerably longer time. We can speculate that, in those cases, the players exploited the links to the catalogue records.
Whenever a user plays two game rounds within an interval of 15 min, we consider those rounds consecutive. The time-between-rounds distribution shows the already observed tendency, with a visible group of between-rounds intervals of around 40 s, which we can attribute to the most “curious” users who spent some time exploring the asset map.
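The 15-minute rule amounts to splitting each user’s round timestamps into sequences of consecutive rounds. A possible sketch (function and variable names are ours, not from the paper):

```python
from datetime import datetime, timedelta

def split_sessions(timestamps, gap_minutes=15):
    """Group round start times into sequences of consecutive rounds:
    two rounds are consecutive if they start within gap_minutes of each other."""
    timestamps = sorted(timestamps)
    sessions, current = [], [timestamps[0]]
    for t in timestamps[1:]:
        if t - current[-1] <= timedelta(minutes=gap_minutes):
            current.append(t)       # still within the same play session
        else:
            sessions.append(current)
            current = [t]           # a gap > 15 min starts a new session
    sessions.append(current)
    return sessions
```

The between-rounds intervals analyzed above are then simply the gaps inside each session.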
6.2 Analysis of Learning Effect
It is worth noting that, at the end of each game level, the player not only learns if his/her answer was correct, but can also see the asset names for the four photos (cf. top-right screenshot in Fig. 1). Since the same image can appear more than once as a distracting option, over time this can lead the player to learn to recognize some of the assets. We evaluate this possible learning effect by measuring the average number of correct guesses per round as a function of the number of played rounds. We notice that the players’ precision actually improves along with the game rounds; using a simple linear regression model, we can estimate this improvement at 1.1 % per played round.
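The regression step reduces to fitting a least-squares slope of per-round precision against the round number. A sketch with hypothetical data (the actual per-round precision values are not reported here; the paper’s estimated slope corresponds to 1.1 % per round):

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (simple linear regression)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return cov / var

# Hypothetical data: precision improving by one percentage point per round
rounds = [1, 2, 3, 4, 5]
precision = [0.50, 0.51, 0.52, 0.53, 0.54]
slope = ols_slope(rounds, precision)  # approximately 0.01, i.e. +1 % per round
```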
6.3 Subjective Analysis of the Educational Incentive
Some of the questions in the aforementioned survey aim to evaluate the educational incentive of the game, as shown in Fig. 5. The question displayed on the left investigates the incentives to continue playing as perceived by the users. While the main game features are among the highly rated incentives, it is interesting to notice that 27 % of players stated that learning new things is a stimulus to play the game. This is further confirmed by the question on the right in Fig. 5, asking the participants whether they learned anything new about Milan while playing. The response distribution clearly shows that an “educational” effect was strongly perceived by Indomilando players. We can conclude that Indomilando has an evident educational “collateral effect” that makes players acquire new knowledge about cultural heritage assets.
7 Conclusions
In this paper, we presented the design, development and evaluation of Indomilando, a cultural heritage GWAP aimed at ranking the assets and the photos of the collection of historical-artistic architectures of Milan. Our evaluation results support our claims that: (1) the game is effective in achieving its ranking purpose, because the resulting rank is highly correlated with a ground truth and this outcome is achieved in a very limited time; (2) Indomilando shows a good engagement potential, because most players find the game fun and we also notice a user group that spent a significantly longer time playing; and (3) the game also leads to a learning/educational effect, because players are motivated to acquire new knowledge about the Milan cultural heritage. The interplay of the ranking purpose and the educational incentive, however, is to be further investigated.
References
Law, E., von Ahn, L.: Human computation. Synth. Lect. Artif. Intell. Mach. Learn. 5(3), 1–121 (2011)
Howe, J.: The rise of crowdsourcing. Wired Mag. 14(6), 1–4 (2006)
Ridge, M.: From tagging to theorizing: deepening engagement with cultural heritage through crowdsourcing. Curator Mus. J. 56(4), 435–450 (2013)
Holley, R.: Crowdsourcing: how and why should libraries do it? D-Lib Mag. 16(3), 4 (2010)
Oomen, J., Aroyo, L.: Crowdsourcing in the cultural heritage domain: opportunities and challenges. In: Proceedings of C&T 2011, pp. 138–149. ACM (2011)
von Ahn, L.: Games with a purpose. IEEE Comput. 39(6), 92–94 (2006)
Hacker, S., von Ahn, L.: Matchin: eliciting user preferences with an online game. In: Proceedings of CHI 2009, pp. 1207–1216. ACM (2009)
Hees, J., Khamis, M., Biedert, R., Abdennadher, S., Dengel, A.: Collecting links between entities ranked by human association strengths. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 517–531. Springer, Heidelberg (2013)
Celino, I., Della Valle, E., Gualandris, R.: On the effectiveness of a mobile puzzle game UI to crowdsource linked data management tasks. In: CrowdUI (2014)
von Ahn, L., Dabbish, L.: Labeling images with a computer game. In: CHI 2004, pp. 319–326. ACM (2004)
Wieser, C., Bry, F., Bérard, A., Lagrange, R.: ARTigo: building an artwork search engine with games and higher-order latent semantic analysis. In: HCOMP (2013)
de Vreede, T., Nguyen, C., de Vreede, G.-J., Boughzala, I., Oh, O., Reiter-Palmon, R.: A theoretical model of user engagement in crowdsourcing. In: Antunes, P., Gerosa, M.A., Sylvester, A., Vassileva, J., de Vreede, G.-J. (eds.) CRIWG 2013. LNCS, vol. 8224, pp. 94–109. Springer, Heidelberg (2013)
Mao, A., Kamar, E., Horvitz, E.: Why stop now? predicting worker engagement in online crowdsourcing. In: HCOMP (2013)
Mao, A., et al.: Volunteering versus work for pay: incentives and tradeoffs in crowdsourcing. In: HCOMP (2013)
Feyisetan, O., Simperl, E., Van Kleek, M., Shadbolt, N.: Improving paid microtasks through gamification and adaptive furtherance incentives. In: WWW 2015 (2015)
von Ahn, L.: Duolingo: learn a language for free while helping to translate the web. In: IUI 2013, pp. 1–2. ACM (2013)
Beal, C.R., Morrison, C.T., Villegas, J.C.: Human computation as an educational opportunity. In: Michelucci, P. (ed.) Handbook of Human Computation, pp. 163–170. Springer, Heidelberg (2013)
Garcia, I.: Learning a language for free while translating the web. Does duolingo work? Int. J. Engl. Linguist. 3(1), 19 (2013)
Wilson, E.B.: Probable inference, the law of succession, and statistical inference. J. Am. Stat. Assoc. 22(158), 209–212 (1927)
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. SIAM J. Discrete Math. 17(1), 134–160 (2003)
Pihur, V., Datta, S., Datta, S.: Weighted rank aggregation of cluster validation measures: a monte carlo cross-entropy approach. Bioinformatics 23(13), 1607–1615 (2007)
Acknowledgments
This work was supported by the SmartCulture project co-funded by Regione Lombardia (POR-FESR 2007-2013, id 40393840).
© 2016 Springer International Publishing Switzerland
Celino, I., Fiano, A., Fino, R. (2016). Analysis of a Cultural Heritage Game with a Purpose with an Educational Incentive. In: Bozzon, A., Cudré-Mauroux, P., Pautasso, C. (eds.) Web Engineering. ICWE 2016. Lecture Notes in Computer Science, vol. 9671. Springer, Cham. https://doi.org/10.1007/978-3-319-38791-8_28