Multimodal Named Entity Disambiguation for Noisy Social Media Posts

Seungwhan Moon, Leonardo Neves, Vitor Carvalho

Abstract

We introduce the new Multimodal Named Entity Disambiguation (MNED) task for multimodal social media posts such as Snapchat or Instagram captions, which consist of short captions with accompanying images. Social media posts pose significant challenges for disambiguation because 1) ambiguity comes not only from polysemous entities but also from inconsistent or incomplete notations, 2) the surrounding words provide very limited context, and 3) many emerging entities are unseen during training. To this end, we build a new dataset called SnapCaptionsKB, a collection of Snapchat image captions submitted to public and crowd-sourced stories, with named entity mentions fully annotated and linked to entities in an external knowledge base. We then build a deep zero-shot multimodal network for MNED that 1) extracts contexts from both text and image, and 2) predicts the correct entity in the knowledge graph embedding space, allowing zero-shot disambiguation of entities unseen in the training set as well. The proposed model significantly outperforms state-of-the-art text-only NED models, showing the efficacy and potential of the MNED task.
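
The zero-shot prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the paper uses a deep network, whereas this example assumes a plain weighted average as a stand-in for learned text and image fusion, and assumes the fused vector already lives in the same space as the knowledge base entity embeddings. All names (`fuse_contexts`, `rank_entities`, `kb_embeddings`) and all vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse_contexts(text_vec, image_vec, text_weight=0.5):
    """Assumed stand-in for a learned fusion layer: a weighted average of the two contexts."""
    return [text_weight * t + (1.0 - text_weight) * i
            for t, i in zip(text_vec, image_vec)]


def rank_entities(mention_vec, kb_embeddings):
    """Rank every knowledge base entity by similarity to the predicted mention vector.

    Because ranking only needs an entity's embedding, entities that never
    appeared in training data can still be scored (the zero-shot case).
    """
    scored = [(name, cosine(mention_vec, emb)) for name, emb in kb_embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up embeddings; "entity_c" plays the role of an entity unseen in training.
kb_embeddings = {
    "entity_a": [1.0, 0.0, 0.0],
    "entity_b": [0.0, 1.0, 0.0],
    "entity_c": [0.0, 0.0, 1.0],
}
text_vec = [0.2, 0.1, 0.7]   # hypothetical context vector from the caption text
image_vec = [0.0, 0.1, 0.9]  # hypothetical context vector from the image

mention_vec = fuse_contexts(text_vec, image_vec)
ranking = rank_entities(mention_vec, kb_embeddings)
best_entity = ranking[0][0]
```

Scoring against embeddings rather than a fixed output layer is what lets the candidate set grow without retraining: adding an entity to `kb_embeddings` is enough for it to be ranked.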

Anthology ID: P18-1186
Volume: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month: July
Year: 2018
Address: Melbourne, Australia
Editors: Iryna Gurevych, Yusuke Miyao
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 2000–2008
URL: https://aclanthology.org/P18-1186
DOI: 10.18653/v1/P18-1186
Cite (ACL): Seungwhan Moon, Leonardo Neves, and Vitor Carvalho. 2018. Multimodal Named Entity Disambiguation for Noisy Social Media Posts. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2000–2008, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal): Multimodal Named Entity Disambiguation for Noisy Social Media Posts (Moon et al., ACL 2018)
PDF: https://aclanthology.org/P18-1186.pdf
Video: https://aclanthology.org/P18-1186.mp4
