CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization

Semih Yavuz, Chung-Cheng Chiu, Patrick Nguyen, Yonghui Wu

Abstract

Maximum-likelihood estimation (MLE) is one of the most widely used approaches for training structured prediction models for text-generation-based natural language processing applications. However, besides exposure bias, models trained with MLE suffer from the wrong-objective problem: they are trained to maximize word-level correct next-step prediction but are evaluated with respect to sequence-level discrete metrics such as ROUGE and BLEU. Several variants of policy-gradient methods address some of these problems by optimizing for the final discrete evaluation metrics, and they show improvements over MLE training on downstream tasks such as text summarization and machine translation. However, policy-gradient methods suffer from high sample variance, making the training process very difficult and unstable. In this paper, we present an alternative direction towards mitigating this problem by introducing a new objective (CaLcs) based on a differentiable surrogate of the longest common subsequence (LCS) measure, which captures sequence-level structural similarity. Experimental results on abstractive summarization and machine translation validate the effectiveness of the proposed approach.
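To make the idea of a "differentiable surrogate" of LCS more concrete, the sketch below contrasts the classic discrete LCS dynamic program with one possible continuous relaxation. It is an illustrative assumption, not code from the paper: the function names `lcs_length` and `soft_lcs`, the `match_prob` input, and the exact form of the relaxed recursion are hypothetical choices made here for exposition, and the paper's actual formulation may differ.

```python
def lcs_length(a, b):
    """Classic discrete LCS length via dynamic programming."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                # Tokens match: extend the best subsequence ending before both.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: carry over the better of the two neighbors.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def soft_lcs(match_prob):
    """Hypothetical relaxed LCS (an assumption, not the paper's code).

    match_prob[i][j] is taken to be the probability that the model's
    output at step i equals reference token j. The hard equality test
    of the discrete recursion is replaced by a probability-weighted
    mix of the "match" and "no match" branches.
    """
    n, m = len(match_prob), len(match_prob[0])
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            p = match_prob[i - 1][j - 1]
            match = table[i - 1][j - 1] + 1.0
            skip = max(table[i - 1][j], table[i][j - 1])
            table[i][j] = p * match + (1.0 - p) * skip
    return table[n][m]
```

When every entry of `match_prob` is exactly 0 or 1, this relaxation reduces to the discrete recursion. With soft probabilities the result varies smoothly with each entry almost everywhere, so in an automatic-differentiation framework a gradient could flow back to the model's output distribution.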

Anthology ID: D18-1406
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Editors: Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 3708–3718
URL: https://aclanthology.org/D18-1406
DOI: 10.18653/v1/D18-1406
Cite (ACL): Semih Yavuz, Chung-Cheng Chiu, Patrick Nguyen, and Yonghui Wu. 2018. CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3708–3718, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization (Yavuz et al., EMNLP 2018)
PDF: https://aclanthology.org/D18-1406.pdf
Video: https://aclanthology.org/D18-1406.mp4
Data: CNN/Daily Mail
