Normalizing Spatial Information to Improve Geographical Information Indexing and Retrieval in Digital Libraries

  • Conference paper
Advances in Spatial Data Handling and GIS

Abstract

Our contribution addresses geographic information contained in unstructured textual documents. The main focus of this article is a general indexing strategy that is designed for spatial information, but which could be applied to temporal and thematic information as well. More specifically, we have developed a process flow that indexes the spatial information contained in textual documents. This process flow interprets spatial information and computes the corresponding accurate footprints. Our goal is to normalize spatial information of heterogeneous granularity and scale (points, polylines, polygons). This normalization is carried out at the index level by grouping spatial information together within spatial areas and by using statistics to compute frequencies for such areas and weights for the retrieved documents.
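
The index-level normalization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how spatial areas are defined or which statistics are used. The sketch assumes a regular grid of square tiles as the spatial areas, bounding boxes as footprints, and a TF-IDF style weight. The names `TILE_DEG`, `tiles_for_bbox`, `tile_frequencies`, and `build_index` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical tile size in degrees; the abstract does not specify one.
TILE_DEG = 0.5


def tiles_for_bbox(min_lon, min_lat, max_lon, max_lat, size=TILE_DEG):
    """Return the set of (col, row) grid tiles that a bounding box overlaps.

    Points, polylines, and polygons are all reduced to their bounding box
    here (an assumption), so every footprint maps to the same kind of unit.
    """
    c0, c1 = math.floor(min_lon / size), math.floor(max_lon / size)
    r0, r1 = math.floor(min_lat / size), math.floor(max_lat / size)
    return {(c, r) for c in range(c0, c1 + 1) for r in range(r0, r1 + 1)}


def tile_frequencies(footprints, size=TILE_DEG):
    """Count how many footprints of one document touch each tile."""
    counts = Counter()
    for bbox in footprints:
        counts.update(tiles_for_bbox(*bbox, size=size))
    return counts


def build_index(docs, size=TILE_DEG):
    """Map each tile to {doc_id: weight}.

    The weight is tile frequency times log(N / document frequency),
    an assumed TF-IDF analogue; a tile present in every document gets 0.
    """
    tf = {doc_id: tile_frequencies(fps, size) for doc_id, fps in docs.items()}
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())
    n = len(docs)
    index = {}
    for doc_id, counts in tf.items():
        for tile, freq in counts.items():
            idf = math.log(n / df[tile])
            index.setdefault(tile, {})[doc_id] = freq * idf
    return index
```

Calling `build_index` with a dictionary that maps document identifiers to lists of `(min_lon, min_lat, max_lon, max_lat)` tuples returns, for each tile, the documents that mention it and their weights. A query footprint can be converted with `tiles_for_bbox` and matched against the same tiles, which is what makes footprints of different granularity comparable in this sketch.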


Notes

  1. Mountains of the south west of France.

  2. Part of this project is supported by the Greater Pau City Council and the MIDR media library.

  3. E.g. for word “forgotten” the truncation returns “forgot”.

  4. E.g. for word “forgotten” the lemmatization returns “forget”.


Author information

Correspondence to Damien Palacio.


Copyright information

© 2012 Springer-Verlag GmbH Berlin Heidelberg

About this paper

Cite this paper

Palacio, D., Sallaberry, C., Gaio, M. (2012). Normalizing Spatial Information to Improve Geographical Information Indexing and Retrieval in Digital Libraries. In: Yeh, A., Shi, W., Leung, Y., Zhou, C. (eds) Advances in Spatial Data Handling and GIS. Lecture Notes in Geoinformation and Cartography. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25926-5_6
