Abstract
Compared with extractive machine reading comprehension (MRC), which is limited to text spans, multi-choice MRC is more flexible for evaluating a model’s ability to utilize external commonsense knowledge. On the one hand, existing methods leverage transfer learning and complicated matching networks to solve multi-choice MRC, but they lack interpretability for commonsense questions. On the other hand, although Transformer-based pre-trained language models such as BERT have shown powerful performance in MRC, external knowledge such as unspoken commonsense and world knowledge still cannot be used explicitly in downstream tasks. In this work, we present three simple yet effective injection methods, plugged into BERT’s structure, that fine-tune multi-choice MRC tasks directly with off-the-shelf commonsense representations. Moreover, we introduce a mask mechanism for token-level multi-hop relationship search to filter external knowledge. Experimental results indicate that the incremental BERT outperforms the baseline by a considerable margin on DREAM and CosmosQA, two knowledge-driven multi-choice datasets. Further analysis shows the robustness of the incremental model when the training set is incomplete.
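The abstract does not specify how the commonsense representations are fused into BERT’s hidden states, so the following is only a minimal sketch of one plausible injection scheme: a sigmoid-gated sum of each token’s hidden state with an aligned, off-the-shelf knowledge embedding. The gate parameter `W_g` and the function name `inject_knowledge` are illustrative assumptions, not the paper’s actual method.

```python
import numpy as np

def inject_knowledge(hidden, knowledge, W_g):
    """Gated fusion of token hidden states with aligned knowledge embeddings.

    hidden:    (seq_len, d) token representations from a BERT layer
    knowledge: (seq_len, d) external commonsense embeddings, one per token
    W_g:       (2d, d) gate projection (a hypothetical learned parameter)

    Returns a (seq_len, d) matrix where each element is a convex
    combination of the original hidden state and the knowledge vector.
    """
    # Sigmoid gate computed from the concatenation of both sources.
    pre_gate = np.concatenate([hidden, knowledge], axis=-1) @ W_g
    gate = 1.0 / (1.0 + np.exp(-pre_gate))
    # Elementwise interpolation: gate -> 1 keeps BERT's state,
    # gate -> 0 replaces it with the knowledge embedding.
    return gate * hidden + (1.0 - gate) * knowledge

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    h = rng.standard_normal((5, 8))   # 5 tokens, hidden size 8
    k = rng.standard_normal((5, 8))
    W_g = 0.1 * rng.standard_normal((16, 8))
    fused = inject_knowledge(h, k, W_g)
    print(fused.shape)  # (5, 8)
```

Because the gate lies strictly in (0, 1), each fused value stays between the corresponding hidden and knowledge values, so the injection cannot push a representation outside the range spanned by its two sources.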
Notes
During visualization, we use a row-wise softmax operation to normalize similarity scores over all sequence tokens.
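The note above describes the normalization used during visualization; a minimal NumPy sketch of a row-wise softmax over a token-by-token similarity matrix (the matrix here is illustrative dummy data):

```python
import numpy as np

def normalize_similarity(scores):
    """Row-wise softmax over a (seq_len, seq_len) similarity matrix.

    After normalization, each row (one token's similarity scores over
    all sequence tokens) is a probability distribution summing to one.
    """
    # Subtract the row max before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    sim = np.array([[2.0, 1.0, 0.1],
                    [0.5, 0.5, 0.5],
                    [3.0, 0.0, 0.0]])
    probs = normalize_similarity(sim)
    print(probs.sum(axis=1))  # each row sums to 1
```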
References
Bauer L, Wang Y, Bansal M (2018) Commonsense for generative multi-hop question answering tasks. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pp 4220–4230. https://aclanthology.info/papers/D18-1454/d18-1454
Chen Q, Zhu X, Ling Z, Inkpen D, Wei S (2018) Neural natural language inference models enhanced with external knowledge. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pp 2406–2417. https://doi.org/10.18653/v1/P18-1224. https://www.aclweb.org/anthology/P18-1224/
Choi E, He H, Iyyer M, Yatskar M, Yih Wt, Choi Y, Liang P, Zettlemoyer L (2018) Quac: Question answering in context. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, pp 2174–2184. http://aclweb.org/anthology/D18-1241
Clark P, Cowhey I, Etzioni O, Khot T, Sabharwal A, Schoenick C, Tafjord O (2018) Think you have solved question answering? Try ARC, the AI2 Reasoning Challenge. CoRR arXiv:1803.05457
Devlin J, Chang M, Lee K, Toutanova K (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pp 4171–4186. https://aclweb.org/anthology/papers/N/N19/N19-1423/
Dhingra B, Mazaitis K, Cohen WW (2017) Quasar: Datasets for question answering by search and reading. CoRR arXiv:1707.03904
Ding M, Zhou C, Chen Q, Yang H, Tang J (2019) Cognitive graph for multi-hop reading comprehension at scale. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 2694–2703. https://www.aclweb.org/anthology/P19-1259/
Hermann KM, Kociský T, Grefenstette E, Espeholt L, Kay W, Suleyman M, Blunsom P (2015) Teaching machines to read and comprehend. In: Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, pp 1693–1701. http://papers.nips.cc/paper/5945-teaching-machines-to-read-and-comprehend
Hill F, Bordes A, Chopra S, Weston J (2016) The goldilocks principle: Reading children’s books with explicit memory representations. In: 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. arXiv:1511.02301
Hong C, Yu J, Tao D, Wang M (2014) Image-based three-dimensional human pose recovery by multiview locality-sensitive sparse retrieval. IEEE Trans Ind Electron 62(6):3742–3751
Hong C, Yu J, Wan J, Tao D, Wang M (2015) Multimodal deep autoencoder for human pose recovery. IEEE Trans Image Process 24(12):5659–5670. https://doi.org/10.1109/TIP.2015.2487860
Hong C, Yu J, Zhang J, Jin X, Lee K (2019) Multimodal face-pose estimation with multitask manifold deep learning. IEEE Trans Ind Informatics 15 (7):3952–3961. https://doi.org/10.1109/TII.2018.2884211
Huang L, Bras RL, Bhagavatula C, Choi Y (2019) Cosmos QA: machine reading comprehension with contextual commonsense reasoning. In: Inui K, Jiang J, Ng V, Wan X (eds) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, Association for Computational Linguistics, pp 2391–2401. https://doi.org/10.18653/v1/D19-1243
Jin D, Gao S, Kao J, Chung T, Hakkani-Tür D (2020) MMM: multi-stage multi-task learning for multi-choice reading comprehension. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, AAAI Press, pp 8010–8017. https://aaai.org/ojs/index.php/AAAI/article/view/6310
Joshi M, Choi E, Weld DS, Zettlemoyer L (2017) Triviaqa: a large scale distantly supervised challenge dataset for reading comprehension. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pp 1601–1611. https://doi.org/10.18653/v1/P17-1147
Kociský T, Schwarz J, Blunsom P, Dyer C, Hermann KM, Melis G, Grefenstette E (2018) The narrativeqa reading comprehension challenge. TACL 6:317–328. https://transacl.org/ojs/index.php/tacl/article/view/1197
Lai G, Xie Q, Liu H, Yang Y, Hovy EH (2017) RACE: large-scale reading comprehension dataset from examinations. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017, Copenhagen, Denmark, September 9-11, 2017, pp 785–794. https://aclanthology.info/papers/D17-1082/d17-1082
Li Z, Niu C, Meng F, Feng Y, Li Q, Zhou J (2019) Incremental transformer with deliberation decoder for document grounded conversations. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 12–21. https://www.aclweb.org/anthology/P19-1002/
Mihaylov T, Clark P, Khot T, Sabharwal A (2018) Can a suit of armor conduct electricity? A new dataset for open book question answering. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pp 2381–2391. https://www.aclweb.org/anthology/D18-1260/
Mihaylov T, Frank A (2018) Knowledgeable reader: Enhancing cloze-style reading comprehension with external commonsense knowledge. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pp 821–832. https://doi.org/10.18653/v1/P18-1076. https://www.aclweb.org/anthology/P18-1076/
Miller GA, Beckwith R, Fellbaum C, Gross D, Miller KJ (1990) Introduction to wordnet: An on-line lexical database. Int J Lexicography 3(4):235–244. https://doi.org/10.1093/ijl/3.4.235
Nguyen T, Rosenberg M, Song X, Gao J, Tiwary S, Majumder R, Deng L (2016) MS MARCO: A human generated machine reading comprehension dataset. In: Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 9, 2016. http://ceur-ws.org/Vol-1773/CoCoNIPS_2016_paper9.pdf
Pan X, Sun K, Yu D, Chen J, Ji H, Cardie C, Yu D (2019) Improving question answering with external knowledge. In: Fisch A, Talmor A, Jia R, Seo M, Choi E, Chen D (eds) Proceedings of the 2nd Workshop on Machine Reading for Question Answering, MRQA@EMNLP 2019, Hong Kong, China, November 4, 2019, Association for Computational Linguistics, pp 27–37. https://doi.org/10.18653/v1/D19-5804
Peters ME, Neumann M, IV RLL, Schwartz R, Joshi V, Singh S, Smith NA (2019) Knowledge enhanced contextual word representations. In: Inui K, Jiang J, Ng V, Wan X (eds) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019, Hong Kong, China, November 3-7, 2019, Association for Computational Linguistics, pp 43–54. https://doi.org/10.18653/v1/D19-1005
Rajpurkar P, Zhang J, Lopyrev K, Liang P (2016) Squad: 100, 000+ questions for machine comprehension of text. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pp 2383–2392. http://aclweb.org/anthology/D/D16/D16-1264.pdf
Reddy S, Chen D, Manning CD (2019) Coqa: A conversational question answering challenge. TACL 7:249–266. https://transacl.org/ojs/index.php/tacl/article/view/1572
Seo MJ, Kembhavi A, Farhadi A, Hajishirzi H (2017) Bidirectional attention flow for machine comprehension. In: 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings. https://openreview.net/forum?id=HJ0UKP9ge
Speer R, Chin J, Havasi C (2017) Conceptnet 5.5: An open multilingual graph of general knowledge. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA., pp 4444–4451. http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14972
Sun K, Yu D, Chen J, Yu D, Choi Y, Cardie C (2019) DREAM: A challenge dataset and models for dialogue-based reading comprehension. TACL 7:217–231. https://transacl.org/ojs/index.php/tacl/article/view/1534
Talmor A, Herzig J, Lourie N, Berant J (2019) Commonsenseqa: A question answering challenge targeting commonsense knowledge. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pp 4149–4158. https://www.aclweb.org/anthology/N19-1421/
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pp 5998–6008. http://papers.nips.cc/paper/7181-attention-is-all-you-need
Wang C, Jiang H (2019) Explicit utilization of general knowledge in machine reading comprehension. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 2263–2272. https://www.aclweb.org/anthology/P19-1219/
Wang L, Sun M, Zhao W, Shen K, Liu J (2018) Yuanfudao at semeval-2018 task 11: Three-way attention and relational knowledge for commonsense machine comprehension. In: Proceedings of The 12th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2018, New Orleans, Louisiana, USA, June 5-6, 2018, pp 758–762. https://aclanthology.info/papers/S18-1120/s18-1120
Wang R, Tang D, Duan N, Wei Z, Huang X, Ji J, Cao G, Jiang D, Zhou M (2020) K-adapter: Infusing knowledge into pre-trained models with adapters. CoRR arXiv:2002.01808
Wang S, Yu M, Jiang J, Chang S (2018) A co-matching model for multi-choice reading comprehension. In: Gurevych I, Miyao Y (eds) Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers, Association for Computational Linguistics, pp 746–751. https://doi.org/10.18653/v1/P18-2118. https://www.aclweb.org/anthology/P18-2118/
Wang W, Yang N, Wei F, Chang B, Zhou M (2017) Gated self-matching networks for reading comprehension and question answering. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pp 189–198. https://doi.org/10.18653/v1/P17-1018
Wang X, Gao T, Zhu Z, Liu Z, Li J, Tang J (2019) KEPLER: A unified model for knowledge embedding and pre-trained language representation. CoRR arXiv:1911.06136
Welbl J, Stenetorp P, Riedel S (2018) Constructing datasets for multi-hop reading comprehension across documents. TACL 6:287–302. https://transacl.org/ojs/index.php/tacl/article/view/1325
Xiong W, Du J, Wang WY, Stoyanov V (2020) Pretrained encyclopedia: Weakly supervised knowledge-pretrained language model. In: 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net. https://openreview.net/forum?id=BJlzm64tDH
Xiong W, Yu M, Chang S, Guo X, Wang WY (2019) Improving question answering over incomplete kbs with knowledge-aware reader. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 4258–4264. https://www.aclweb.org/anthology/P19-1417/
Yang A, Wang Q, Liu J, Liu K, Lyu Y, Wu H, She Q, Li S (2019) Enhancing pre-trained language representations with rich knowledge for machine reading comprehension. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 2346–2357. https://www.aclweb.org/anthology/P19-1226/
Yang B, Mitchell TM (2017) Leveraging knowledge bases in lstms for improving machine reading. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, pp 1436–1446. https://doi.org/10.18653/v1/P17-1132
Yang Z, Qi P, Zhang S, Bengio Y, Cohen WW, Salakhutdinov R, Manning CD (2018) Hotpotqa: A dataset for diverse, explainable multi-hop question answering. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pp 2369–2380. https://aclanthology.info/papers/D18-1259/d18-1259
Yu J, Tan M, Zhang H, Tao D, Rui Y (2019) Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell
Yu J, Tao D, Wang M, Rui Y (2015) Learning to rank using user clicks and visual features for image retrieval. IEEE Trans Cybern 45(4):767–779. https://doi.org/10.1109/TCYB.2014.2336697
Zhang S, Liu X, Liu J, Gao J, Duh K, Durme BV (2018) Record: Bridging the gap between human and machine commonsense reading comprehension. CoRR arXiv:1810.12885
Zhang S, Zhao H, Wu Y, Zhang Z, Zhou X, Zhou X (2020) DCMN+: dual co-matching network for multi-choice reading comprehension. In: The Thirty-Fourth AAAI Conference on Artificial Intelligence, AAAI 2020, The Thirty-Second Innovative Applications of Artificial Intelligence Conference, IAAI 2020, The Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2020, New York, NY, USA, February 7-12, 2020, AAAI Press, pp 9563–9570. https://aaai.org/ojs/index.php/AAAI/article/view/6502
Zhang Z, Han X, Liu Z, Jiang X, Sun M, Liu Q (2019) ERNIE: enhanced language representation with informative entities. In: Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28- August 2, 2019, Volume 1: Long Papers, pp 1441–1451. https://www.aclweb.org/anthology/P19-1139/
Zhong W, Tang D, Duan N, Zhou M, Wang J, Yin J (2019) Improving question answering by commonsense-based pre-training. In: Tang J, Kan M, Zhao D, Li S, Zan H (eds) Natural Language Processing and Chinese Computing - 8th CCF International Conference, NLPCC 2019, Dunhuang, China, October 9-14, 2019, Proceedings, Part I, Lecture Notes in Computer Science, vol 11838. Springer, Berlin, pp 16–28. https://doi.org/10.1007/978-3-030-32233-5_2
Zhu P, Zhao H, Li X (2020) Dual multi-head co-attention for multi-choice reading comprehension. CoRR arXiv:2001.09415
Acknowledgements
This work was supported by grant 2020-KF-10 from the Henan Key Laboratory for Big Data Processing & Analytics of Electronic Commerce.
Cite this article
Li, R., Wang, L., Jiang, Z. et al. Incremental BERT with commonsense representations for multi-choice reading comprehension. Multimed Tools Appl 80, 32311–32333 (2021). https://doi.org/10.1007/s11042-021-11197-0