Abstract
Text representation is crucial for many natural language processing applications. This paper presents an approach to extracting binary lexical relations (BLR) from Portuguese texts in order to represent phrasal cohesion mechanisms. We demonstrate how this automatic strategy can be incorporated into information retrieval systems. Our approach is compared with approaches that use bigrams and noun phrases for text retrieval. The BLR strategy is shown to improve on the best performance in an experimental information retrieval system.
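The contrast between bigrams and relation pairs as index terms can be illustrated with a minimal sketch. It is not the authors' extraction method: the abstract does not describe how BLRs are obtained, so the relation pairs below are written by hand and are purely hypothetical, as are the function names `tokenize`, `bigram_terms`, and `overlap`. The example is in English for readability, although the paper works on Portuguese texts.

```python
import re


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_terms(text):
    """Return the set of adjacent word pairs (order preserved)."""
    tokens = tokenize(text)
    return set(zip(tokens, tokens[1:]))


def overlap(query_terms, doc_terms):
    """Fraction of query terms that also occur among the document terms."""
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


query = "information retrieval"
doc = "retrieval of information"

# Bigrams depend on adjacency and word order, so the two phrasings
# share no index term.
bigram_score = overlap(bigram_terms(query), bigram_terms(doc))

# ASSUMPTION: hand-written (head, modifier) pairs standing in for the
# output of a relation extractor; they are not produced by the paper's method.
query_pairs = {("retrieval", "information")}
doc_pairs = {("retrieval", "information")}
pair_score = overlap(query_pairs, doc_pairs)

print(bigram_score, pair_score)
```

Under these assumptions the two phrasings share no bigram but share the single relation pair, which is the kind of variation a relation-based representation is meant to absorb.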
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gonzalez, M., Strube de Lima, V.L., Valdeni de Lima, J. (2005). Binary Lexical Relations for Text Representation in Information Retrieval. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_3
DOI: https://doi.org/10.1007/11428817_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26031-8
Online ISBN: 978-3-540-32110-1
eBook Packages: Computer Science, Computer Science (R0)