Abstract
In this article, we introduce an explicit count-based strategy for building word space models with syntactic contexts (dependencies). We define a filtering method to reduce the size of the explicit word-context vectors. This traditional strategy is compared with a neural embedding (predictive) model that is also based on syntactic dependencies, using the same parsed corpus for both models. In addition, the dependency-based methods are compared with bag-of-words strategies, both count-based and predictive. The results show that our traditional count-based model with syntactic dependencies outperforms the other strategies, including dependency-based embeddings, but only for tasks focused on discovering similarity between words with the same function (i.e. near-synonyms).
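As a rough illustration of the count-based strategy described above (a minimal sketch on an invented toy corpus, not the paper's actual implementation), the following builds explicit word-context vectors from dependency triples, weights them with positive PMI, and applies a simple top-k filter to reduce vector size:

```python
from collections import Counter
from math import log, sqrt

# Toy (head, relation, dependent) triples standing in for parser output.
# These triples are invented for illustration only.
triples = [
    ("drink", "dobj", "water"),
    ("drink", "dobj", "juice"),
    ("pour", "dobj", "water"),
    ("pour", "dobj", "juice"),
    ("drink", "nsubj", "child"),
    ("pour", "nsubj", "child"),
]

# Each dependent word gets syntactic contexts of the form (relation, head).
pair_counts = Counter((dep, (rel, head)) for head, rel, dep in triples)
word_counts = Counter(dep for _, _, dep in triples)
ctx_counts = Counter((rel, head) for head, rel, _ in triples)
total = sum(pair_counts.values())

def ppmi_vector(word):
    """Explicit (sparse) vector of positive PMI weights over contexts."""
    vec = {}
    for (w, ctx), n in pair_counts.items():
        if w != word:
            continue
        pmi = log((n * total) / (word_counts[w] * ctx_counts[ctx]))
        if pmi > 0:
            vec[ctx] = pmi
    return vec

def filter_top(vec, k):
    """Keep only the k highest-weighted contexts (vector reduction)."""
    return dict(sorted(vec.items(), key=lambda kv: -kv[1])[:k])

def cosine(u, v):
    num = sum(u[c] * v[c] for c in u if c in v)
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0
```

On this toy corpus, "water" and "juice" share all their syntactic contexts and come out maximally similar, while "water" and "child" share none, which is the kind of same-function (near-synonym) similarity that dependency contexts favor.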

Notes
We use bow to refer to linear bag-of-words contexts, which must be distinguished from continuous bag-of-words (CBOW). Unlike linear bag-of-words, CBOW uses a continuous distributed representation of the context. It is a learning strategy that tries to predict a word given its context, instead of predicting the context given a word, as in the skip-gram model.
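The difference between the two learning strategies mentioned in this note can be sketched by how the training pairs are constructed (a minimal illustration; the function names and toy input are our own, not from word2vec itself):

```python
def skipgram_pairs(tokens, window=2):
    # Skip-gram: predict each context word from the centre word,
    # so every (centre, context) pair is a separate training example.
    pairs = []
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((w, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=2):
    # CBOW: predict the centre word from the whole bag of context words,
    # which is summarised as one continuous distributed representation.
    pairs = []
    for i, w in enumerate(tokens):
        ctx = tuple(tokens[j]
                    for j in range(max(0, i - window), min(len(tokens), i + window + 1))
                    if j != i)
        pairs.append((ctx, w))
    return pairs
```

For the toy sentence ["a", "b", "c"], skip-gram yields six (centre, context) pairs, whereas CBOW yields three (context-bag, centre) pairs, one per token.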
The number of target words differs between the count-based and predictive models because of the various heuristics and thresholds (hyperparameters) used to generate each of them.
Acknowledgments
This research has been partially funded by the Spanish Ministry of Economy and Competitiveness through project FFI2014-51978-C2-1-R. We are very grateful to Omer Levy and Yoav Goldberg for sending us the parsed corpus used to build their embeddings. We are also grateful to the reviewers for their useful comments and suggestions.
Cite this article
Gamallo, P. Comparing explicit and predictive distributional semantic models endowed with syntactic contexts. Lang Resources & Evaluation 51, 727–743 (2017). https://doi.org/10.1007/s10579-016-9357-4