Abstract
Community-based question-and-answer websites such as Yahoo Answers, Stack Overflow, Stack Exchange, and Quora rapidly generate content in the form of questions and answers posted by users. These online discussion forums allow users to add and edit content easily. As a result, a large amount of content has accumulated over time, which helps search engine users find content related to their queries. On the other hand, maintaining answer quality and ranking retrieved results has always been challenging for websites that accumulate user-generated data. Here, the problem of ranking answers based on user satisfaction is investigated, and the best answers are identified from that ranking. Three techniques are proposed: Relevance Features (REL_FS), Readability Features (RED_FS), and Weighted Answer Rank (WEIGHTED ANS_RANK). REL_FS sums the relevance features, RED_FS sums the readability features, and WEIGHTED ANS_RANK sums REL_FS and RED_FS. The proposed methods thus rank answers according to both their relevance and their readability. Experimental results show that the proposed methods rank answers effectively for the selection of best answers. Moreover, the consistency of the ranked answers is measured through a detailed correlational analysis.
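The three scores described above can be illustrated with a minimal sketch. It assumes that each answer already carries pre-computed relevance and readability feature values; the feature names (`bm25`, `cosine`, `readability_index`), the `Answer` class, and the weights `w_rel` and `w_red` are hypothetical and are not taken from the article, which only states that the scores are summations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Answer:
    """An answer with hypothetical, pre-computed feature values."""
    text: str
    relevance: Dict[str, float] = field(default_factory=dict)
    readability: Dict[str, float] = field(default_factory=dict)


def rel_fs(answer: Answer) -> float:
    """REL_FS: sum of the answer's relevance feature values."""
    return sum(answer.relevance.values())


def red_fs(answer: Answer) -> float:
    """RED_FS: sum of the answer's readability feature values."""
    return sum(answer.readability.values())


def weighted_ans_rank(answer: Answer, w_rel: float = 1.0, w_red: float = 1.0) -> float:
    """WEIGHTED ANS_RANK: sum of REL_FS and RED_FS.

    The weights are an assumption of this sketch; with the defaults
    the score is the plain sum of the two feature scores.
    """
    return w_rel * rel_fs(answer) + w_red * red_fs(answer)


def rank_answers(answers: List[Answer], w_rel: float = 1.0, w_red: float = 1.0) -> List[Answer]:
    """Return the answers ordered from highest to lowest combined score."""
    return sorted(
        answers,
        key=lambda a: weighted_ans_rank(a, w_rel, w_red),
        reverse=True,
    )


if __name__ == "__main__":
    # Illustrative values only; they are not data from the article.
    candidates = [
        Answer("a", {"bm25": 0.2, "cosine": 0.3}, {"readability_index": 0.9}),
        Answer("b", {"bm25": 0.8, "cosine": 0.7}, {"readability_index": 0.4}),
        Answer("c", {"bm25": 0.1, "cosine": 0.1}, {"readability_index": 0.5}),
    ]
    for ans in rank_answers(candidates):
        print(ans.text, round(weighted_ans_rank(ans), 3))
```

Under this sketch the first element returned by `rank_answers` would be treated as the best answer. Setting `w_rel` or `w_red` to zero reduces the ranking to RED_FS or REL_FS alone, which makes it easy to compare the three techniques on the same set of answers.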






Data availability
The Yahoo Answers dataset was obtained from Yahoo Webscope, with the authorized permission of Yahoo Answers, for research purposes only. Interested readers can obtain the data from Yahoo Webscope.
Notes
Contact academicrelations@yahoo-inc.com or visit http://research.yahoo.com/Academic Relations for details on obtaining Yahoo! Webscope data sets.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Banjar, A., Shaheen, A., Amjad, T. et al. Users’ satisfaction based ranking for Yahoo Answers. Multimed Tools Appl 83, 71265–71284 (2024). https://doi.org/10.1007/s11042-024-18433-3
DOI: https://doi.org/10.1007/s11042-024-18433-3