Abstract
This paper presents QualESTIM, a platform for exploring socioeconomic statistical data (also called indicators). QualESTIM integrates several outlier detection methods that make it possible to evaluate the logical consistency of a dataset and, ultimately, its quality. Without recourse to any ‘ground truth’, data values are compared to various spatiotemporal distributions given by statistical models. However, an outlier is not necessarily an error: an expert should always interpret the outlying value. We therefore argue that such a quality assessment process has to be interactive, and that the metadata associated with the data should be made available in order to refine the analysis. Dedicated to the detection of outliers and their visualization by an expert, the platform is connected to a database that contains both the data and their metadata, structured according to an ISO 19115 profile. A case study illustrates the value of this approach.
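To make the idea of flagging values against a distribution concrete, here is a minimal sketch in Python. It is not QualESTIM's implementation, which the abstract does not detail. It assumes one of the simplest possible rules, Tukey's boxplot fences, applied to a single indicator. The function name `tukey_outliers`, the multiplier `k`, and the sample values are all hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return (index, value) pairs lying outside Tukey's fences.

    Assumption: a plain univariate rule, used here only to illustrate
    outlier flagging; it ignores the spatial and temporal context that
    the platform described in the abstract takes into account.
    """
    # Quartiles of the sample (inclusive method treats the data as a full population).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical indicator values for eight regions (not taken from the paper).
unemployment_rate = [7.1, 6.8, 7.4, 8.0, 6.9, 7.2, 21.5, 7.6]
flagged = tukey_outliers(unemployment_rate)
print(flagged)
```

A value flagged this way is only a candidate for review. In line with the abstract, an expert would still need to consult the associated metadata before deciding whether it is an error or a genuine exceptional value.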
Acknowledgements
The research presented in this paper has been supported by the ESPON 2013 database project, of the European Spatial Planning and Observation Network for Territorial Cohesion. We would like to thank Claude Grasland for his advice, as well as Martin Charlton and Paul Harris, who provided the implementation of the outlier detection methods in R. The authors would also like to thank the reviewers for their comments, which helped to improve the paper.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Plumejeaud, C., Villanova-Oliver, M. (2012). QualESTIM: Interactive Quality Assessment of Socioeconomic Data Using Outlier Detection. In: Gensel, J., Josselin, D., Vandenbroucke, D. (eds) Bridging the Geographic Information Sciences. Lecture Notes in Geoinformation and Cartography. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29063-3_8
DOI: https://doi.org/10.1007/978-3-642-29063-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29062-6
Online ISBN: 978-3-642-29063-3
eBook Packages: Earth and Environmental Science (R0)