Abstract
Community question answering (CQA) portals have become one of the most important sources for people seeking information on the Internet. With a large number of online users ready to help, askers are willing to post questions in CQA and are likely to obtain desirable answers. However, answer quality in CQA varies widely, from helpful answers to abusive spam, so answer quality assessment is of great significance. Most existing approaches evaluate answer quality based on the relevance between questions and answers; owing to the lexical gap between questions and answers, these approaches are often unsatisfactory. In this paper, a novel approach is proposed to rank candidate answers, which uses support sets to reduce the impact of the lexical gap between questions and answers. First, similar questions are retrieved, and support sets are built from their high-quality answers. Based on the assumption that high-quality answers to similar questions also share intrinsic similarity, the quality of a candidate answer is then evaluated by its distance from the support set in terms of both content and structure. Unlike most existing approaches, prior knowledge from similar question-answer pairs is used to bridge the lexical and semantic gaps between questions and answers. Experiments are conducted on approximately 2.15 million real-world question-answer pairs from Yahoo! Answers to verify the effectiveness of our approach. The results on the MAP@K and MRR metrics show that the proposed approach ranks candidate answers precisely.
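The pipeline described in the abstract (retrieve similar questions, collect their high-quality answers into a support set, then score candidate answers by their closeness to that set) can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration rather than the authors' implementation: it uses TF-IDF cosine similarity as a stand-in for both the question-retrieval step and the content-distance measure, and it ignores the structural features the paper also considers. All function names and the toy archive are hypothetical.

```python
# Minimal sketch of support-set-based answer ranking (illustrative only).
# Assumes an archive of (question, best_answer) pairs and uses TF-IDF
# cosine similarity in place of the paper's retrieval and similarity models.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def build_support_set(query_question, archive, k=5):
    """Retrieve the k most similar archived questions and return their best answers."""
    questions = [q for q, _ in archive]
    vec = TfidfVectorizer().fit(questions + [query_question])
    sims = cosine_similarity(vec.transform([query_question]),
                             vec.transform(questions))[0]
    top = sims.argsort()[::-1][:k]
    return [archive[i][1] for i in top]


def rank_candidates(candidates, support_set):
    """Score each candidate answer by its mean similarity to the support set."""
    vec = TfidfVectorizer().fit(candidates + support_set)
    scores = cosine_similarity(vec.transform(candidates),
                               vec.transform(support_set)).mean(axis=1)
    return sorted(zip(candidates, scores), key=lambda pair: -pair[1])


if __name__ == "__main__":
    archive = [
        ("How do I reset a forgotten email password?",
         "Use the provider's account-recovery page and verify with your phone."),
        ("What is the best way to recover a lost account password?",
         "Go to the login page, click 'forgot password', and follow the reset link."),
    ]
    query = "How can I recover my email password?"
    candidates = [
        "Click the forgot-password link and follow the recovery steps.",
        "Buy a new computer.",
    ]
    support = build_support_set(query, archive, k=2)
    for answer, score in rank_candidates(candidates, support):
        print(f"{score:.3f}  {answer}")
```

In this toy run, the candidate that overlaps with the support set scores higher than the irrelevant one, even though it shares few words with the question itself, which is the intuition behind bridging the lexical gap through similar question-answer pairs.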
Acknowledgement
The work described in this paper was supported by the National Natural Science Foundation of China (No. 61202362), the National Key Basic Research Program of China (No. 2013CB329606), and a project funded by the China Postdoctoral Science Foundation (No. 2013M542560).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xie, Z., Nie, Y., Jin, S., Li, S., Li, A. (2015). Answer Quality Assessment in CQA Based on Similar Support Sets. In: Sun, M., Liu, Z., Zhang, M., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. CCL NLP-NABD 2015. Lecture Notes in Computer Science, vol 9427. Springer, Cham. https://doi.org/10.1007/978-3-319-25816-4_25
DOI: https://doi.org/10.1007/978-3-319-25816-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25815-7
Online ISBN: 978-3-319-25816-4
eBook Packages: Computer Science, Computer Science (R0)