OntoTouTra: Tourist Traceability Ontology Based on Big Data Analytics
Figure 1. Tourist traceability system: use case.
Figure 2. Snippet of the image of the upper levels of OntoTouTra (using WebVOWL [45]).
Figure 3. OntoTouTra architecture.
Figure 4. OntoTouTra development model.
Figure 5. Web scraping class.
Figure 6. Listing of Data link to GeoNames for obtaining city coordinates.
Figure 7. Results of data link to GeoNames for obtaining city coordinates.
Figure 8. Distribution of the scores of the top 10 nationalities of reviewers of Colombia’s tourist reviews dataset obtained from OntoTouTra (language: English).
Figure 9. An example of transformation rules from the Cities spreadsheet.
Figure 10. Big Data lifecycle [16].
Figure 11. Python code snippet about OTA web scraping.
Figure 12. Example of ontology visualization: Main tourist destinations in Colombia.
Figure 13. Example of the visualization of tourist destinations in Colombia from OntoTouTra.
Figure 14. Application of sentiment analysis techniques to determine the Satisfaction KPI in Colombia.
Figure 15. Example of satisfaction KPI (Colombia): positive reviews of the destinations. Obtained from OntoTouTra.
Figure 16. Example of the polarity and subjectivity of the reviews about the Colombian destinations. Obtained from OntoTouTra.
Figure 17. Architecture diagram for the data pipeline.
Figure 18. Review data stream: unstructured.
Figure 19. Rating predictor algorithm.
Figure 20. Performance of the rating prediction model.
Abstract
1. Introduction
2. Related Work
3. The Ontology: OntoTouTra
3.1. Tourist Traceability System
- POI: What are the busiest POIs? What type of visitors frequent them? In what time slot are they visited? Where do the tourists come from? Where do they go afterwards? What activities do they mostly do? What tourist experiences do they enjoy?
- Seasonality: How does seasonality behave at the destination? What activities are carried out due to seasonality? What services do tourists consume due to seasonality? What is the offer of tourist experiences?
- Suppliers: What is the level of satisfaction with the services provided? What is needed to satisfy the demand?
- Stakeholders: How do stakeholders interact at the beginning, during, and at the end of the visit to the tourist destination? What suggestions do tourists have regarding this service chain?
3.2. OntoTouTra Analysis
- DMOs provide the service that the tourist consumes;
- The tourists live the experiences in the destination;
- The tourist attractions are the push factor and motivator for the tourist;
- The destination is the geographical location where tourist traceability happens.
3.3. Development of the Ontology on the Domain of Tourism Traceability
3.3.1. Specification
3.3.2. Conceptualization
Ontology Main Class | Data Source (Individuals) | Linked Data | Data Sources Used in This Research |
---|---|---|---|
Tourist | social networks: OTA, eWOM | foaf | [48,49,50,51] |
Experience | tourist providers’ datasets (DMOs) | | MinCIT-Open Data [52], DataEco [53] |
Provider | government providers’ datasets | | MinCIT [54] |
City | social networks | GeoNames | [48] |
Attraction | social networks, IoT (POI wireless transmitters) | GeoNames | [48], beacons |
Hotel | social networks: eWOMs, OTAs | | [48] |
Review | social networks: eWOMs, OTAs | time | [48] |
Term | Synonym | Acronym | Description | Type |
---|---|---|---|---|
Attraction | Point-of-Interest | PoI | A place of interest that tourists visit for its value or significance. | Class |
Tourist | Visitor | | A person who travels away from their normal residential region for a temporary period of at least one night, to the extent that their behavior involves a search for leisure experiences from interactions with features or characteristics of the places they choose to visit. | Class |
Tourist experience | | TE | A set of activities in which individuals engage on their personal terms, such as pleasant and memorable places, allowing each tourist to build his or her own travel experiences so that these satisfy a wide range of personal needs. | Class |
Destination | City | | A geographical area consisting of all the services and infrastructure necessary for the stay of a specific tourist or tourism segment. | Class |
Provider | Supplier | | All businesses offering tourism services and experiences to consumers when the latter are traveling and performing tourism activities. | Class |
Review | Opinion | | A subjective opinion of a tourist’s experience. | Subclass |
3.3.3. Formalization and Implementation
3.3.4. Evaluation
3.3.5. Documentation
3.4. Model for the Development of OntoTouTra
3.4.1. Definition of the Ontology’s Purpose
3.4.2. Data Sources
3.4.3. Data Collecting
3.4.4. Tourist Location Dataset
3.4.5. Tourist Reviews Dataset
3.4.6. Ontology Input Data Files
3.4.7. Ontology Building
- Layer 1 corresponds to the input data, mainly from ubiquitous computing sources such as social networks, sensors located at the destination, and users’ mobile devices. The data were processed through a data analysis pipeline in which we applied qualitative and quantitative techniques to extract insight. Data analytics provides the means to examine the findings of exploratory data analysis (EDA) and confirmatory data analysis (CDA). Using EDA, we explored the data to find patterns and relationships among the ontology elements. Through CDA, we answered specific questions of the tourism domain, based mainly on direct observation of the data.
- Layer 2 is the logical layer, achieved by reasoning over the OWL/RDF storage. Reasoning is constrained by the domain and range restrictions defined in the ontology. Using this layer, we can explain the content, run queries, and verify the integrity of the ontology.
- Layer 3 corresponds to the presentation; OntoTouTra allows data visualization with different SPARQL endpoints, APIs, and graph visualization tools.
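The integrity check and querying attributed to Layer 2 can be illustrated with a small sketch. It is a plain-Python stand-in, not the OWL/RDF reasoner used in the paper: triples are tuples, and domain and range are treated as simple closed-world checks, which simplifies OWL semantics. The property names `visits` and `hasHotel` come from the relationship table, but the domain/range pairs, the individuals, and the helper names `violations` and `objects_of` are assumptions made for illustration.

```python
# Toy triple store. Class and property names follow the paper's tables,
# but the individuals and domain/range pairs are assumptions.
TYPES = {
    "ana": "Tourist",
    "bogota": "City",
    "hotel_1": "Hotel",
}

# Hypothetical domain/range restrictions for two object properties.
RESTRICTIONS = {
    "visits": ("Tourist", "City"),
    "hasHotel": ("City", "Hotel"),
}

TRIPLES = [
    ("ana", "visits", "bogota"),
    ("bogota", "hasHotel", "hotel_1"),
    ("hotel_1", "visits", "bogota"),  # deliberately invalid: a Hotel cannot visit
]


def violations(triples, types, restrictions):
    """Return the triples whose subject or object type breaks a restriction."""
    bad = []
    for subject, prop, obj in triples:
        domain, range_ = restrictions[prop]
        if types.get(subject) != domain or types.get(obj) != range_:
            bad.append((subject, prop, obj))
    return bad


def objects_of(triples, subject, prop):
    """Tiny pattern query: all objects o such that (subject, prop, o) holds."""
    return [o for s, p, o in triples if s == subject and p == prop]


invalid = violations(TRIPLES, TYPES, RESTRICTIONS)
visited = objects_of(TRIPLES, "ana", "visits")
print(invalid)
print(visited)
```

In the toolkit the paper lists, the same pattern query would be expressed in SPARQL through RDFLib rather than with a list comprehension.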
3.4.8. Ontology Validation
4. Development and Usage of OntoTouTra in Big Data Environments
4.1. Big Data Analytics Lifecycle for Building the TTS Ontology
4.1.1. Business Case Evaluation
4.1.2. Data Identification
4.1.3. Data Acquisition and Filtering
4.1.4. Data Extraction
4.1.5. Data Validation and Cleansing
4.1.6. Data Aggregation and Representation
4.1.7. Data Analysis
4.1.8. Data Visualization
4.1.9. Utilization of Analysis Results
4.2. Using Big Data
4.2.1. Components of the Analytics Toolkit
4.2.2. Variety of Data
4.2.3. Big Data Semantics
- The identification of relevant terms from a large and messy data source. Web-scraping techniques made it possible to obtain, clean, and filter the data from tourist social network sites. Because of the volume, variety, and velocity of the data, Big Data pipelines were designed and implemented for data processing;
- Significance and value of the domain. NLP techniques were applied to filter the terms used to build the knowledge base of the ontology;
- Ontology construction. Big Data tooling handled the data preprocessing, after which an ontology-building tool supported the creation of the thesaurus, the classifications, the taxonomy, the concept sets, the links between concepts, documentation, grouping into collections, mapping by means of concept schemes, inference, and mapping links;
- The reasoning. The bidirectional relationship of Big Data semantics was fundamental in the application of the OntoTouTra ontology. The semantic basis was the ontology. For instance, we set axioms that determined the polarity of the tourist reviews.
4.2.4. Classification Using Big Data
- Refinement of the ontology: A vocabulary was generated with NLP techniques (see Section 4.2.3) to obtain the glossary of the TTS domain to implement the stages of the specification and conceptualization of the ontology (see Section 3.3, Table 3);
- Data validation and cleaning: Using data-mining and text-mining techniques, we applied text preprocessing to the tourist reviews (see Section 4.1 and Section 4.2.3 and Figure 19), such as tokenization to obtain terms by removing whitespace and punctuation symbols; removal of numbers so as not to affect the review sentiment measurement; elimination of stopwords; removal of scores; stemming according to language; and applying filters to determine the effect of negation;
- Classification of reviews: The reviews provided us with different categories of data, and based on these categories, we were able to classify them. Not every category was present in every review. Depending on the category, we applied supervised or unsupervised machine-learning classification algorithms. Table 8 depicts the categories identified in the reviews and the type of classification algorithm used depending on whether the reviews had labels;
- Prediction of reviews rating: We used a bidirectional-LSTM-network-based classifier to predict ratings using the vocabulary generated from the review terms (see Section 4.2.3 and Figure 19);
- Data visualization: Using the MapReduce programming and processing model, we generated Big Data datasets with a distributed and parallel algorithm on a cluster. We used the map procedure to filter and sort the displayed data, and we executed the summary operations with the reduce method. An example is the heat map visualization in Figure 13, where we mapped the country’s regions and reduced the hotel count by region to represent them on a map with the plotly.express library.
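The text-preprocessing steps listed above under data validation and cleaning can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the stopword and negation lists are tiny hypothetical samples (a real pipeline would take them from a library such as NLTK), stemming is omitted, and the `not_` prefix is one common way to carry the effect of negation into the token stream.

```python
import re

# Tiny illustrative word lists (assumptions, not the authors' lists).
STOPWORDS = {"the", "a", "an", "was", "is", "and", "of", "to", "in", "we"}
NEGATIONS = {"not", "no", "never"}


def preprocess(review):
    """Lowercase, drop numbers and punctuation, remove stopwords, mark negation."""
    text = review.lower()
    text = re.sub(r"\d+", " ", text)      # remove numbers and numeric scores
    tokens = re.findall(r"[a-z]+", text)  # tokenize on runs of letters only
    result = []
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True                 # flag the next content word
            continue
        if token in STOPWORDS:
            continue
        result.append("not_" + token if negate else token)
        negate = False
    return result


tokens = preprocess("The room was not clean, and we paid 120 dollars!")
print(tokens)
```

Marking the negated word keeps "clean" and "not clean" apart for a downstream polarity classifier, which a plain bag of words would otherwise merge.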
5. Evaluation
5.1. Evaluation of the Ontology
5.2. Conceptual Validation
5.3. Ontology Testing
6. Analysis of the Results
7. Data Treatment
8. Discussion and Conclusions
Supplementary Materials
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
API | application programming interface |
CDA | confirmatory data analysis |
CQ | competency questions |
DMO | destination management organization |
EDA | exploratory data analysis |
eWOM | electronic word-of-mouth |
GQM | goal–question–metric approach |
IoT | Internet of Things |
ISO | International Organization for Standardization |
KM | knowledge management |
KPI | key performance indicator |
NLP | natural language processing |
OntoTouTra | Ontology for Tourist Traceability |
OTA | online travel agency |
OWL | Web Ontology Language |
POI | point of interest |
RDF | Resource Description Framework |
RDFS | Resource Description Framework Schema |
SPARQL | SPARQL Protocol and RDF Query Language |
ToSs | terms of service |
TTS | tourist traceability system |
UNWTO | United Nations World Tourism Organization |
Appendix A. Ontology Repository
References
- Chantre Astaiza, A.; Fuentes-Moraleda, L.; Muñoz-Mazón, A.; Ramirez-Gonzalez, G. Science Mapping of Tourist Mobility 1980–2019. Technological Advancements in the Collection of the Data for Tourist Traceability. Sustainability 2019, 11, 4738.
- Schuitemaker, R.; Xu, X. Product traceability in manufacturing: A technical review. Procedia CIRP 2020, 93, 700–705.
- ISO. ISO 12875:2011. Traceability of Finfish Products. Available online: https://www.iso.org/obp/ui/#iso:std:iso:12875:ed-1:v1:en (accessed on 2 November 2019).
- GS1. The GS1 Traceability Standard: What You Need to Know; Technical Report; Global Office: Brussels, Belgium, 2007; Available online: https://www.gs1.org/docs/traceability/GS1_tracebility_what_you_need_to_know.pdf (accessed on 2 November 2019).
- Chandrasekaran, B.; Josephson, J.; Benjamins, V.R. What Are Ontologies, and Why Do We Need Them? IEEE Intell. Syst. Their Appl. 1999, 14, 20–26.
- Xiang, Z.; Gretzel, U.; Fesenmaier, D. Semantic Representation of Tourism on the Internet. J. Travel Res. 2009, 47, 440–453.
- Tribe, J.; Liburd, J.J. The tourism knowledge system. Ann. Tour. Res. 2016, 57, 44–61.
- Mouhim, S.; Aoufi, A.; Cherkaoui, C.; Hassan, D.; Mammass, D. A knowledge Management Approach Based on Ontologies: The Case of tourism. Int. J. Comput. Sci. Emerg. Technol. 2011, 2, 362–369.
- Uschold, M.; Grüninger, M. Ontologies: Principles, methods and applications. Knowl. Eng. Rev. 1996, 11, 93–136.
- Missikoff, M.; Taglino, F. An Ontology-based Platform for Semantic Interoperability. In Handbook on Ontologies; Staab, S., Studer, R., Eds.; Springer: Berlin/Heidelberg, Germany, 2004; pp. 617–633.
- Carloni, O. Boolean Formulas of Simple Conceptual Graphs SGBF. In Proceedings of the Second International Conference on Graph Structures for Knowledge Representation and Reasoning, Barcelona, Spain, 16 July 2011; pp. 18–67.
- Siorpaes, K.; Bachlechner, D. OnTour: Tourism Information Retrieval based on YARS. In Proceedings of the 3rd European Semantic Web Conference (ESWC 2006), Budva, Montenegro, 11–June 2006.
- Prantner, K.; Ding, Y.; Luger, M.; Yan, Z.; Herzog, C. Tourism ontology and semantic management system: State-of-The-Arts analysis. In Proceedings of the IADIS International Conference: IADIS, Vila Real, Portugal, 5–8 October 2007.
- Siricharoen, W.V. Using Ontologies for E-tourism. In Proceedings of the 4th WSEAS/IASME International Conference on Engineering Education (EE 2007) Proceeding, Crete Island, Greece, 24–26 July 2007.
- Zhao, X.; Liu, L.; Wang, H.; Song, W. Ontology Construction of the Field of Tourism in Africa. In Proceedings of the 2015 8th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 12–13 December 2015; pp. 47–50.
- Erl, T.; Khattak, W.; Buhler, P. Big Data Fundamentals: Concepts, Drivers & Techniques; ServiceTech Press: Englewood Cliffs, NJ, USA, 2016.
- Huang, Y.; Bian, L. Using Ontologies and Formal Concept Analysis to Integrate Heterogeneous Tourism Information. IEEE Trans. Emerg. Top. Comput. 2015, 3, 172–184.
- Valls, A.; Gibert, K.; Orellana, A.; Antón-Clavé, S. Using ontology-based clustering to understand the push and pull factors for British tourists visiting a Mediterranean coastal destination. Inf. Manag. 2018, 55, 145–159.
- Miller, G.; Beckwith, R.; Fellbaum, C.; Gross, D.; Miller, K. Introduction to WordNet: An On-line Lexical Database. Int. J. Lexicogr. 1991, 3, 235–244.
- Islam, M.R.; Hossain, B.A.; Imteaj, M.N.; Akhter, S.; Jogesh, H.S.; Mostafa, M.B. OnTraNetBD: A knowledgebase for the travel network in bangladesh. In Proceedings of the 2017 IEEE Region 10 Humanitarian Technology Conference (R10-HTC), Dhaka, Bangladesh, 21–23 December 2017; pp. 170–174.
- Giunchiglia, F.; Dutta, B. DERA: A Faceted Knowledge Organization Framework. In Proceedings of the International Conference on Theory and Practice of Digital Libraries, Lyon, France, 25–27 August 2011.
- Suchanek, F.; Kasneci, G.; Weikum, G. Yago: A Large Ontology from Wikipedia and WordNet. J. Web Semant. 2008, 6, 203–217.
- Rodríguez-García, M.; Valencia-García, R.; Garcia-Sanchez, F.; Samper Zapater, J.J. Creating a semantically-enhanced cloud services environment through ontology evolution. Future Gener. Comput. Syst. 2014, 32, 295–306.
- Llorens, J.; Morato, J.; Génova, G.; Fuentes, J.; Quintana, V.; Díaz, I. RHSP: An Information Representation Model Based on Relationship. Stud. Fuzziness Soft Comput. 2004, 159, 221–253.
- Santamaria-Granados, L.; Mendoza-Moreno, J.F.; Ramirez-Gonzalez, G. Tourist Recommender Systems Based on Emotion Recognition—A Scientometric Review. Future Internet 2021, 13, 2.
- Chu, Y.; Wang, H.; Zheng, L.; Wang, Z.; Tan, K.L. TRSO: A Tourism Recommender System Based on Ontology. In Proceedings of the International Conference on Knowledge Science, Engineering and Management, Passau, Germany, 5–7 October 2016; Volume 9983.
- Guergour, H.E.; Boufaïda, Z. A domain ontology building process based on principles of social web. In Proceedings of the 2012 International Conference on Information Technology and e-Services, Las Vegas, NV, USA, 16–18 April 2012; pp. 1–6.
- Moreno, A.; Valls, A.; Isern, D.; Marin, L.; Borràs, J. SigTur/E-Destination: Ontology-based personalized recommendation of Tourism and Leisure Activities. Eng. Appl. Artif. Intell. 2013, 26, 633–651.
- Shoval, N.; Ahas, R. The use of tracking technologies in tourism research: The first decade. Tour. Geogr. 2016, 18, 587–606.
- Girardin, F.; Calabrese, F.; Dal Fiore, F.; Ratti, C.; Blat, J. Digital Footprinting: Uncovering Tourists with User-Generated Content. IEEE Pervasive Comput. 2009, 7, 36–43.
- Mariani, M.; Borghi, M. Effects of the Booking.com rating system: Bringing hotel class into the picture. Tour. Manag. 2018, 66, 47–52.
- Lytvyn, V.; Vysotska, V.; Burov, Y.; Demchuk, A. Architectural Ontology Designed for Intellectual Analysis of E-Tourism Resources. In Proceedings of the 2018 IEEE 13th International Scientific and Technical Conference on Computer Sciences and Information Technologies (CSIT), Lviv, Ukraine, 11–14 September 2018; Volume 1, pp. 335–338.
- Lee, C.I.; Hsia, T.C.; Hsu, H.C.; Lin, J.Y. Ontology-based tourism recommendation system. In Proceedings of the 2017 4th International Conference on Industrial Engineering and Applications (ICIEA), Nagoya, Japan, 27–29 April 2017; pp. 376–379.
- Smirnov, A.; Ponomarev, A.; Shilov, N.; Kashevnik, A.; Teslya, N. Ontology-Based Human-Computer Cloud for Decision Support: Architecture and Applications in Tourism. Int. J. Embed. Real-Time Commun. Syst. 2018, 9, 1–19.
- Prasamuarso Kuntarto, G.; Gunawan, I.; Moechtar, F.; Ahmadin, Y.; Santoso, B.I. Dwipa Ontology III: Implementation of Ontology Method Enrichment on Tourism Domain. Int. J. Smart Sens. Intell. Syst. 2017, 10, 903–919.
- Borràs, J.; Flor, J.; Perez, Y.; Moreno, A.; Valls, A.; Isern, D.; Orellana, A.; Russo, A.; Clavé, S. SigTur/E-Destination: A System for the Management of Complex Tourist Regions. In Information and Communication Technologies in Tourism; Springer: Vienna, Austria, 2011; pp. 39–50.
- Wick, M. GeoNames Ontology; Technical Report; Unxos GmbH: Wollerau, Switzerland, 2015; Available online: http://download.geonames.org/export/dump/readme.txt (accessed on 21 March 2019).
- Frontini, F.; Del Gratta, R.; Monachini, M. GeoDomainWordNet: Linking the GeoNames Ontology to WordNet. In Proceedings of the Language and Technology Conference, Poznań, Poland, 7–9 December 2016; Volume 9561, pp. 229–242.
- Team, G. GeoNames Webservice Subdivision Levels. Available online: https://www.GeoNames.org/export/subdiv-level.html (accessed on 21 March 2019).
- DANE. Geovisor de Consulta de Codificación de la Divipola. Available online: https://geoportal.dane.gov.co/geovisores/territorio/consulta-divipola-division-politico-administrativa-de-colombia/ (accessed on 21 March 2019).
- Cox, S.; Little, C. Time Ontology in Owl. Available online: https://www.w3.org/TR/owl-time/ (accessed on 21 March 2019).
- International Open Data Charter ODC. ODC Principles. Available online: https://opendatacharter.net/adopt-the-charter/ (accessed on 21 March 2019).
- Ministerio de Tecnologías de la Información y las Comunicaciones. Datos Abiertos. Available online: https://www.datos.gov.co/ (accessed on 21 March 2019).
- Situr Boyacá. Sistema de Información Turística de Boyacá. Available online: https://situr.boyaca.gov.co/ (accessed on 21 March 2019).
- Lohmann, S.; Negru, S.; Haag, F.; Ertl, T. Visualizing Ontologies with VOWL. Semant. Web 2016, 7, 399–419.
- Fernández-López, M.; Gomez-Perez, A.; Juristo, N. METHONTOLOGY: From ontological art towards ontological engineering. In Proceedings of the Engineering Workshop on Ontological Engineering (AAAI97), Stanford, CA, USA, 24–26 March 1997.
- Kumara, B.; Paik, I.; Zhang, J.; Siriweera, T.H.A.; Koswatte, K. Ontology-Based Workflow Generation for Intelligent Big Data Analytics. In Proceedings of the Conference: IEEE International Conference on Web Services (ICWS 2015), New York, NY, USA, 27 June–2 July 2015.
- Booking. Booking.com Home Page. Available online: https://www.booking.com/ (accessed on 9 April 2019).
- Expedia. Expedia.com Home Page. Available online: https://www.expedia.com/ (accessed on 9 April 2019).
- Airbnb. Airbnb.com Home Page. Available online: https://www.airbnb.com/ (accessed on 9 April 2019).
- TripAdvisor. TripAdvisor.com Home Page. Available online: https://www.tripadvisor.com/ (accessed on 9 April 2019).
- MinCIT. Prestadores Registro Nacional de Turismo—Datos Abiertos. Available online: https://www.datos.gov.co/Comercio-Industria-y-Turismo/Prestadores-Registro-Nacional-de-Turismo/npkw-6rke (accessed on 21 March 2019).
- Bermudez, Y.; Aponte, A.; Zuluaga, V.; Moreno, C.; Ceballos, O. Prototipo de Publicación de Datos Turísticos Apoyados en Linked Open Data Para el Consumo de Información del Sector Ecoturístico en el Centro del Valle del Cauca. Available online: https://bibliotecadigital.univalle.edu.co/handle/10893/14492 (accessed on 21 March 2019).
- Ministerio de Comercio, Industria y Turismo. Informes de Turismo. Available online: https://www.mincit.gov.co/estudios-economicos/estadisticas-e-informes/informes-de-turismo (accessed on 21 March 2019).
- Osorio, M.; Garijo, D. Ontology-Based APIs (OBA). Available online: https://oba.readthedocs.io/en/latest/ (accessed on 17 September 2020).
- Musen, M. The Protégé Project: A Look Back and a Look Forward. AI Matters 2015, 1, 4–12.
- Hardi, J. Cellfie Plugin. Available online: https://github.com/protegeproject/cellfie-plugin (accessed on 11 October 2019).
- Gomez-Perez, A.; Fernández-López, M.; Corcho, O. Ontological Engineering: With Examples from the Areas of Knowledge Management, E-Commerce and the Semantic Web; Springer Science & Business Media: New York, NY, USA, 2004.
- Steiner, C.; Albert, D. Validating domain ontologies: A methodology exemplified for concept maps. Cogent Educ. 2017, 4, 1263006.
- Glimm, B.; Horrocks, I.; Motik, B.; Stoilos, G.; Wang, Z. HermiT: An OWL 2 Reasoner. J. Autom. Reason. 2014, 53, 245–269.
- Loshin, D. Big Data Analytics; Morgan Kaufmann: Amsterdam, The Netherlands, 2013.
- Bornhorst, T.; Ritchie, J.; Sheehan, L. Determinants of Tourism Success for DMOs & Destinations: An Empirical Examination of Stakeholders’ Perspectives. Tour. Manag. 2010, 31, 572–589.
- Emani, C.; Cullot, N.; Nicolle, C. Understandable Big Data: A survey. Comput. Sci. Rev. 2015, 17, 70–81.
- Ceravolo, P.; Azzini, A.; Angelini, M.; Catarci, T.; Cudre-Mauroux, P.; Damiani, E.; Mazak, A.; Van Keulen, M.; Jarrar, M.; Santucci, G.; et al. Big Data Semantics. J. Data Semant. 2018, 7, 65–85.
- Lytvyn, V.; Vysotska, V.; Veres, O.; Brodyak, O.; Oryshchyn, O. Big Data analytics ontology. Technol. Audit. Prod. Reserv. 2017, 1, 16–27.
- Gruber, T.R. Toward principles for the design of ontologies used for knowledge sharing? Int. J. Hum. Comput. Stud. 1995, 43, 907–928.
- Glimm, B.; Horrocks, I.; Motik, B.; Stoilos, G. Optimising Ontology Classification. In Proceedings of the 9th International Semantic Web Conference (ISWC 2010), Shanghai, China, 7–11 November 2010; Patel-Schneider, P.F., Pan, Y., Hitzler, P., Mika, P., Zhang, L., Pan, J.Z., Horrocks, I., Glimm, B., Eds.; Springer: Shanghai, China, 2010; Volume 6496, pp. 225–240.
- Poveda-Villalón, M.; Gomez-Perez, A.; Suárez-Figueroa, M.C. OOPS! (OntOlogy Pitfall Scanner!): An on-line tool for ontology evaluation. Int. J. Semant. Web Inf. Syst. 2014, 10, 7–34.
- Bandeira, J.; Bittencourt, I.; Espinheira, P.; Isotani, S. FOCA: A Methodology for Ontology Evaluation. arXiv 2016, arXiv:1612.03353.
- Ferrari, S.; Cribari-Neto, F. Beta Regression for Modelling Rates and Proportions. J. Appl. Stat. 2004, 31, 799–815.
- Bezerra, C.; Freitas, F.; da Silva Santana, F. Evaluating Ontologies with Competency Questions. In Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Atlanta, GA, USA, 17–20 November 2013; pp. 284–285.
- Office for National Statistics. Measuring Tourism Locally; ONS: Newport, UK, 2010.
- UNWTO. Country Fact Sheets–Colombia. Available online: https://webunwto.s3.eu-west-1.amazonaws.com/s3fs-public/2020-10/colombia.pdf (accessed on 4 February 2020).
- UNWTO. Tourism Seasonality across Destinations. Available online: https://www.unwto.org/seasonality (accessed on 4 February 2020).
- Tantau, T. The TikZ and PGF Packages–Manual for Version 3.1.9a; Institut für Theoretische Informatik, Universität zu Lübeck: Lübeck, Germany, 2021.
- Chaves, M.; Trojahn, C. Towards a Multilingual Ontology for Ontology-driven Content Mining in Social Web Sites. In Proceedings of the ISWC 2010 Workshops, Shanghai, China, 7–8 November 2010.
- Sicilia, M.A. Handbook of Metadata, Semantics and Ontologies; World Scientific: Singapore, 2013; pp. 393–406.
- Booking. Trip Terms and Conditions. Available online: https://www.booking.com/content/terms.html (accessed on 9 April 2019).
- Krotov, V.; Silva, L. Legality and Ethics of Web Scraping. In Proceedings of the Twenty-Fourth Americas Conference on Information Systems, New Orleans, LA, USA, 16–18 August 2018.
- Mahto, D.K.; Singh, L. A dive into Web Scraper world. In Proceedings of the 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom), New Delhi, India, 16–18 March 2016; pp. 689–693.
Ontology | Year | Purpose | TTS Concepts Covered? |
---|---|---|---|
Architectural ontology [32] | 2018 | e-tourism resources | No. It has an architectural domain. |
OnTraNetBD [20] | 2017 | Uses WordNet for mapping key concepts | No. The ontology establishes the formal relationship between tourist attractions and other travel elements, but not the space–time causality of the tourist. |
Ontology-Based Tourism Recommendation System [33] | 2017 | Travel ontology | Partially. It defines a travel recommendation system based on ontologies, but does not analyze tourists’ routes in the destination. |
Ontology-Based Human–Computer Cloud [34] | 2017 | Building ad hoc decision support services | No. It describes various decision support scenarios in tourism in general, but not specifically for the TTS. |
Dwipa Ontology III [35] | 2017 | Cultural parks, artists, and monuments | No. It is limited to POIs. |
TRSO [26] | 2016 | Recommender system for tourists | Partially. It determines the relationship of tourists with the context to suggest tourist information. |
SigTur/E-Destination [36] | 2011 | Activities and guides | No. It provides a catalog of destination resources to offer personalized information to tourists. |
Mondeca [13] | 2011 | Profiling tourist and cultural objects | Partially. Mondeca has a large number of concepts on tourism, but it is not freely available. |
Moroccan Tourism [8] | 2011 | Ontology of this destination city | No. It is limited to presenting the importance of the knowledge domain in tourism. |
University of Karlsruhe [13] | 2007 | OnTourism project for evaluating the Semantic Web | No. They analyzed seven tourism ontologies and five management tools to create ontologies. |
OnTour project [12] | 2006 | Accommodation and activities | No. It focuses on e-tourism. |
Harmonize Ontology [10] | 2004 | Exchange data between organizations | No. It is aimed at developing an interoperability platform for SMEs in the tourism sector. |
No. | Relationship | No. | Relationship |
---|---|---|---|
1 | belongs | 9 | hasService |
2 | enjoys | 10 | hasServiceCategory |
3 | hasAccommodationType | 11 | hasStateParent |
4 | hasCityParent | 12 | located |
5 | hasCountryParent | 13 | offered |
6 | hasHotel | 14 | operates |
7 | hasHotelScore | 15 | uses |
8 | hasScoreCategory | 16 | visits |
OTA | Founded | Listings | Audience | Countries | Languages |
---|---|---|---|---|---|
Booking.com | 1996 | 28 M | 50 M | 200 | 43 |
Skyscanner | 2001 | 2 M | 60 M | 49 | 30 |
Expedia | 1996 | 590 K | 50 M | 75 | 35 |
TripAdvisor | 2000 | 7.3 M | 490 M | 48 | 28 |
Agoda | 1998 | 2 M | 2.3 M | 65 | 38 |
Airbnb | 2008 | 7 M | 750 M | 220 | 89 |
HostelWorld | 1999 | 36 K | 13 M | 178 | 20 |
Hotelbeds | 2001 | 180 K | 60 K | 185 | 20 |
Software | Use | Function |
---|---|---|
Spark/PySpark | data mining | PySpark Dataframe for Big Data entities: reviews, hotel services, and scores. |
MongoDB | data mining | Temporary storage for NoSQL collections, mainly tourist reviews. |
Python | data mining/queries | Scripting for all functions: scraping, ontology API, loading of individuals, queries, and visualization. |
RDFLib | queries | SPARQL API interface. |
Selenium | data mining | OTA web scraping. |
NLTK | data mining | Definition of ontology classes and terms. Analysis of tourist reviews for queries. |
Item | Count |
---|---|
Reviews | 1,009,469 |
Services | 481,443 |
Hotels | 11,071 |
Destinations | 678 |
OntoTouTra axioms | 698 |
Logical axiom | 352 |
Declaration axioms | 190 |
Class count | 65 |
Object property | 16 |
Data property | 109 |
SubClass Of | 57 |
OntoTouTra axioms | 17,225,580 |
Category | Classifier | Algorithm or Tool |
---|---|---|
Determine the polarity | Supervised | nltk.sentiment.sentiment_analyzer |
Grouping by ratings | Unsupervised | K-means |
Detection of services | Supervised | Named entity recognition (NER) with SpaCy |
Detection of tourist experiences | Supervised | NER with SpaCy |
Detection of POIs | Supervised | NER with SpaCy |
Detection of language | Supervised | nltk.stem |
Goal | Question | Metric | Note | Question Grade | Goal Grade |
---|---|---|---|---|---|
1. Check if the ontology complies with substitutes | Q1. Were the competency questions defined? | Completeness | 13 KPIs as CQ | 100 | 83.3 |
1. Check if the ontology complies with substitutes | Q2. Were the competency questions answered? | Completeness | 13 KPIs answered | 100 | 83.3 |
1. Check if the ontology complies with substitutes | Q3. Did the ontology reuse other ontologies? | Adaptability | Open link data with GeoNames and Time Ontology | 50 | 83.3 |
2. Check if the ontology complies with ontological commitments | Q4. Did the ontology impose a minimal ontological commitment? | Conciseness | Ontology uses abstractions to define concepts | 75 | 75 |
2. Check if the ontology complies with ontological commitments | Q5. Did the ontology impose a maximum ontological commitment? | Conciseness | Ontology does not use many primitive concepts | - | 75 |
2. Check if the ontology complies with ontological commitments | Q6. Are the ontology properties coherent with the domain? | Consistency | Checked by HermiT reasoning (Protégé plugin) | 75 | 75 |
3. Check if the ontology complies with intelligent reasoning | Q7. Are there contradictory axioms? | Consistency | Checked by HermiT reasoning (Protégé plugin) | 100 | 100 |
3. Check if the ontology complies with intelligent reasoning | Q8. Are there redundant axioms? | Conciseness | Checked by HermiT reasoning (Protégé plugin) | 100 | 100 |
4. Check if the ontology complies with efficient computation | Q9. Did the reasoner bring modeling errors? | Computational efficiency | 1 minor error; Checked by OOPS! | 75 | 75 |
4. Check if the ontology complies with efficient computation | Q10. Did the reasoner perform quickly? | Computational efficiency | Depending on Protégé capacity (we ran without the reviews’ individuals: 17.197 ms) | 75 | 75 |
5. Check if the ontology complies with human expression | Q11. Is the documentation consistent with modeling? | Clarity | Documentation generated by Protégé | 100 | 100 |
5. Check if the ontology complies with human expression | Q12. Were the concepts well written? | Clarity | We used the ontology annotations (rdfs:comment) | 100 | 100 |
5. Check if the ontology complies with human expression | Q13. Are there annotations in the ontology that show the definitions of the concepts? | Clarity | We used the ontology annotations (rdfs:comment) | 100 | 100 |
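
The goal grades in the table above are consistent with taking the arithmetic mean of each goal's question grades and skipping the ungraded Q5. The sketch below reproduces them under that assumption. The dictionary keys are shortened goal labels and `goal_grade` is a hypothetical helper, not part of the evaluation tooling.

```python
from statistics import mean

# Question grades per goal, copied from the table; None marks the ungraded Q5 ("-").
question_grades = {
    "1. Substitutes": [100, 100, 50],
    "2. Ontological commitments": [75, None, 75],
    "3. Intelligent reasoning": [100, 100],
    "4. Efficient computation": [75, 75],
    "5. Human expression": [100, 100, 100],
}


def goal_grade(grades):
    """Mean of the graded questions, rounded to one decimal place."""
    graded = [g for g in grades if g is not None]
    return round(mean(graded), 1)


goal_grades = {goal: goal_grade(grades) for goal, grades in question_grades.items()}
for goal, grade in goal_grades.items():
    print(f"{goal}: {grade}")
```

Under this reading, the 50 for Q3 (ontology reuse) alone accounts for the first goal's grade of 83.3.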
Box | KPI | Indicator |
---|---|---|
1 | 01 | % of visitors who rate the overall visitor experience as good or excellent |
1 | 02 | % of customers who consider the overall impression of the WiFi service to be good or excellent |
2 | 03 | Number of day visitors |
3 | 04 | Number of tourism enterprises (accommodation) per 10,000 population |
3 | 05 | Ratio of number of reviews to local population |
3 | 06 | Population rate with hotel influence |
2 | 07 | Foreign tourist arrivals (FTAs) |
2 | 08 | Inbound and domestic tourism |
2 | 09 | Seasonality patterns |
2 | 10 | Tourist experiences |
Test Case | KPI | Expected Results | Comparison Sources | Source’s Data | Results Obtained | Note |
---|---|---|---|---|---|---|
T001 | 1 | Over 60% of visitors rated the experience as good or excellent | - | - | 71.56% | |
T002 | 2 | In Colombia, over 50% of customers considered the WiFi service to be good or excellent | - | - | 53.5% | |
T003 | 3 | In Colombia, in 2019, over 1000 reviews per day | Colombia’s Fact Sheets [73] pages 1–2 | 4,100,000 annual (2019) | 2423 (mean) | Booking’s reviewers represent 21.57% of visitors |
T004 | 4 | In Colombia, two (2) accommodation enterprises per 10,000 population | Colombia’s Fact Sheets [73] page 4 | 5.6 | 2.33 | 28,000 establishments/50 million inhabitants = 5.6. Booking = 2.33 |
T005 | 5 | The number of reviews depends on the local tourism industry (33 departments in Colombia) | [54] page 18 | Bogotá, Antioquia, Bolívar | Bogotá, Antioquia, Bolívar | Top 3 departments |
T006 | 6 | Population rate with hotel influence depends on the local tourism industry | Colombia’s Tourism Report [54] page 28 | San Andrés, Bolívar, Bogotá | Bogotá, San Andrés, Valle | Top 3 departments |
T007 | 7 | Top 10 foreign tourist arrivals (FTAs) in Colombia | Colombia’s Tourism Report [54] page 7 | USA, Peru, France | USA, France, Argentina | Top 3 countries |
T008 | 8 | Inbound and domestic tourism in Colombia per department | Colombia’s Fact Sheets [73] pages 1–2 | 4,100,000 | 459,322 | Inbound travels |
T009 | 9 | Seasonality patterns per month of 2019 in Colombia | UNWTO Seasonality [74] | January–March, July–August | January–April, July–August | Peak seasons |
T010 | 10 | Top 10 Tourist experiences in Colombia | - | - | Beach, tours, game room | Top 3 tourist experiences |
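
The notes for T003 and T004 can be checked with simple arithmetic. The sketch below assumes the 21.57% figure comes from annualizing the mean daily review count over a 365-day year, which reproduces the value in the table. The variable names are illustrative.

```python
# Figures copied from test cases T003 and T004.
annual_visitors_2019 = 4_100_000   # source data, Colombia's Fact Sheets [73]
mean_reviews_per_day = 2423        # result obtained from OntoTouTra
establishments = 28_000
inhabitants = 50_000_000
per_population = 10_000

# T003: share of visitors represented by reviewers (assumes a 365-day year).
annual_reviews = mean_reviews_per_day * 365
reviewer_share = 100 * annual_reviews / annual_visitors_2019

# T004: accommodation enterprises per 10,000 population from the source data.
enterprises_per_10k = establishments / inhabitants * per_population

print(f"T003 reviewer share: {reviewer_share:.2f}%")
print(f"T004 enterprises per 10,000 population: {enterprises_per_10k:.1f}")
```

The Booking figure of 2.33 in T004 cannot be recomputed here because the table does not give the underlying establishment count.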
Item | Feature | Tool |
---|---|---|
1 | SPARQL Interface | Apache Jena, Apache Jena Fuseki, Protégé, OpenLink Virtuoso |
2 | Web interface | RDFLib/Dash, WebVOWL/TikZ [75] |
3 | REST API | Fuseki SOH, Ontology-Based API (OBA) |
4 | Documentation | Protégé, OBA |
Item | Domain | Use | Axioms |
---|---|---|---|
OntoTouTra | Tourist traceability | Decision-making at the destination | OWL |
Mondeca | Tourism | Tourism concepts | OWL |
HarmoNET | Tourism | Accommodation | OWL |
Travel Itinerary | Travel | Tourist itineraries | OWL |
Hontology | Hotel | Hotels | OWL |
OnTour Project | e-Tourism | Accommodation | OWL |
COTRIN | Open Travel Alliance (OTA) specifications | Travel industry | XML schema |
LA_DMS project | DMO | Tourist destination | OWL-S |
Hi-Touch project | Tourism products | Customer’s expectations | OWL |
TAGA | Travel concepts | Simulations | OWL |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mendoza-Moreno, J.F.; Santamaria-Granados, L.; Fraga Vázquez, A.; Ramirez-Gonzalez, G. OntoTouTra: Tourist Traceability Ontology Based on Big Data Analytics. Appl. Sci. 2021, 11, 11061. https://doi.org/10.3390/app112211061