Abstract
China has become the largest online market in the world, with approximately 1 billion internet users, and Baidu runs the world’s largest Chinese search engine, serving hundreds of millions of daily active users and responding to billions of queries per day. To handle diverse query requests from users at web scale, Baidu has made tremendous efforts in understanding users’ queries, retrieving relevant content from a pool of trillions of webpages, and ranking the most relevant webpages at the top of the results. Among the components used in Baidu search, learning to rank (LTR) plays a critical role, and an extremely large number of queries, together with their relevant webpages, must be labelled in a timely manner to train and update the online LTR models. To reduce the cost and time consumption of query/webpage labelling, in this work we study the problem of active learning to rank (active LTR), which selects unlabeled queries for annotation and training. Specifically, we first investigate the ranking entropy (RE) criterion, which characterizes the entropy of relevant webpages under a query as produced by a sequence of online LTR models updated at different checkpoints, using a query-by-committee (QBC) method. Then, we explore a new criterion, prediction variance (PV), which measures the variance of prediction results for all relevant webpages under a query. Our empirical studies find that RE may favor low-frequency queries from the pool for labelling, while PV prioritizes high-frequency queries. Finally, we combine these two complementary criteria as the sample selection strategy for active learning. Extensive experiments with comparisons to baseline algorithms show that the proposed approach can train LTR models that achieve higher discounted cumulative gain (i.e., a relative improvement of ΔDCG4 = 1.38%) with the same budgeted labelling effort.
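The two selection criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: RE is approximated as the average pairwise vote entropy over a committee of checkpoints, PV as the population variance of the latest checkpoint's scores for a query's webpages, and the combination as a weighted sum of max-normalized scores. The names `ranking_entropy`, `prediction_variance`, `select_queries`, and `weight` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations
from statistics import pvariance


def ranking_entropy(committee_scores):
    """Average pairwise vote entropy for one query (assumed form of RE).

    committee_scores[m][d] is the score that committee member m (for
    example, one model checkpoint) assigns to webpage d of the query.
    """
    n_docs = len(committee_scores[0])
    pairs = list(combinations(range(n_docs), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        # Fraction of members ranking webpage i strictly above webpage j;
        # ties count as a vote for j.
        votes = sum(1 for scores in committee_scores if scores[i] > scores[j])
        p = votes / len(committee_scores)
        for q in (p, 1.0 - p):
            if q > 0.0:
                total -= q * math.log(q)
    return total / len(pairs)


def prediction_variance(scores):
    """Variance of one model's scores over a query's webpages (assumed PV)."""
    return pvariance(scores)


def select_queries(pool, budget, weight=0.5):
    """Pick `budget` query ids by a weighted sum of normalized RE and PV.

    pool maps a query id to its committee_scores. The max-normalization and
    the weighted sum are illustrative choices, not the paper's method.
    """
    re = {q: ranking_entropy(s) for q, s in pool.items()}
    pv = {q: prediction_variance(s[-1]) for q, s in pool.items()}
    re_max = max(re.values()) or 1.0  # avoid division by zero
    pv_max = max(pv.values()) or 1.0
    combined = {
        q: weight * re[q] / re_max + (1.0 - weight) * pv[q] / pv_max
        for q in pool
    }
    return sorted(pool, key=combined.get, reverse=True)[:budget]


# Toy pool: two checkpoints scoring the webpages of three queries.
pool = {
    "q_agree": [[0.9, 0.5, 0.1], [0.8, 0.6, 0.2]],
    "q_disagree": [[0.9, 0.1, 0.5], [0.1, 0.9, 0.5]],
    "q_flat": [[0.5, 0.5], [0.5, 0.5]],
}
selected = select_queries(pool, budget=2)
```

In this toy pool, the query on which the two checkpoints disagree about every pair is selected first, because it has both the highest assumed RE and the highest assumed PV. A production system would compute these scores over the unlabeled query pool and send the selected queries to annotators.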
Acknowledgements
This work was supported in part by the National Key R&D Program of China (No. 2021ZD0110303).
Ethics declarations
The authors declare that they have no conflicts of interest related to this work.
Additional information
Colored figures are available in the online version at https://link.springer.com/journal/11633
Qingzhong Wang received the B. Eng. degree in automation and the M. Eng. degree in control science and engineering from Harbin Engineering University, China in 2013 and 2016, respectively, and the Ph. D. degree in computer science from City University of Hong Kong, China in 2021. He is now a researcher at the Big Data Laboratory, Baidu Research, China.
His research interests include computer vision and vision-language learning.
Haifang Li received the B. Eng. degree in mathematics from Shandong University, China in 2011, and the Ph. D. degree in mathematics from University of Chinese Academy of Sciences, China in 2016. She is now a senior algorithm engineer at Baidu Inc., China. Before that, she was an assistant researcher at Institute of Automation, Chinese Academy of Sciences, China.
Her research interests include information retrieval and data mining.
Haoyi Xiong received the Ph. D. degree in computer science from Telecom SudParis and Pierre and Marie Curie University, France in 2015. He is currently a principal architect at the Big Data Laboratory, Baidu Inc., China. From 2016 to 2018, he was a tenure-track assistant professor with the Department of Computer Science, Missouri University of Science and Technology, USA. Before that, he was a postdoc at University of Virginia, USA from 2015 to 2016. He has published more than 70 papers in top computer science conferences and journals. He was a co-recipient of the 2020 IEEE TCSC Award for Excellence in Scalable Computing (Early Career Researcher) and the prestigious Science & Technology Advancement Award (First Prize) from the Chinese Institute of Electronics in 2019.
His research interests include AutoDL and ubiquitous computing.
Wen Wang received the Ph. D. degree in computer science from Department of Software Engineering, East China Normal University, China in 2021. He is now a senior algorithm engineer at Baidu Inc., China.
His research interests include information retrieval and recommendation systems.
Jiang Bian received the B. Eng. degree in logistics systems engineering from Huazhong University of Science and Technology, China in 2014, the M. Sc. degree in industrial systems engineering from University of Florida, USA in 2020, and the Ph. D. degree in computer science from University of Central Florida, USA in 2020. He is a researcher in Baidu Research, China.
His research interests include internet of things, sports analytics and ubiquitous computing.
Yu Lu received the B. Eng. degree in computer science from Xidian University, China in 2010, and the M. Eng. degree in computer science from Xi’an Jiaotong University, China in 2014. He is now a senior algorithm engineer at Baidu Inc., China.
His research interests include information retrieval and data mining.
Shuaiqiang Wang received the B. Sc. and Ph. D. degrees in computer science from Shandong University, China in 2004 and 2009, respectively. He visited Hong Kong Baptist University, China, as an exchange doctoral student in 2009. He is currently a principal algorithm engineer at Baidu Inc., China, leading the Web Search Ranking Strategy Group that advances document ranking for the Baidu Search Engine. Previously, he was a research scientist and senior algorithm engineer at JD Inc., China, responsible for feed recommendation at JD.com. Before that, he worked as an assistant professor at University of Manchester, UK in 2017, and at University of Jyvaskyla, Finland from 2014 to 2017. Earlier, he served as an associate professor at Shandong University of Finance and Economics, China from 2011 to 2014, and as a postdoctoral researcher at Texas State University, USA from 2010 to 2011. He has served as a Senior PC Member of IJCAI and a PC Member of WWW, SIGIR and WSDM in recent years. He has published over 50 papers in leading journals and conferences.
His research interests include information retrieval, recommendation systems and data mining.
Zhicong Cheng received the M. Sc. degree in computer science from Peking University, China in 2011. He is now a distinguished architect at Baidu Incorporated, China.
His research interests include learning to rank, machine learning, information retrieval and question answering.
Dejing Dou received the B. Eng. degree in electronic engineering from Tsinghua University, China in 1996, and the Ph. D. degree in artificial intelligence from Yale University, USA in 2004. He is a professor with the Computer and Information Science Department, University of Oregon, USA, and the lead of the Advanced Integration and Mining Laboratory (AIM Laboratory). He is also the director of the NSF IUCRC Center for Big Learning (CBL).
His research interests include artificial intelligence, data mining, data integration, information extraction, biomedical and health informatics.
Dawei Yin received the B. Sc. degree in computer science from Shandong University, China in 2006, and the M. Sc. and Ph. D. degrees in computer science from Lehigh University, USA in 2010 and 2013, respectively. From 2007 to 2008, he was an M.Phil. student at The University of Hong Kong, China. He is a senior director of engineering at Baidu Incorporated, China. He manages the search science team at Baidu, leading Baidu’s science efforts in web search, question answering, video search, image search, news search, app search, etc. Previously, he was a senior director managing the recommendation engineering team at JD.com between 2016 and 2020. Prior to JD.com, he was a senior research manager at Yahoo Labs, leading the relevance science team and in charge of Core Search Relevance of Yahoo Search. He has published more than 100 research papers in premier conferences and journals, and was a recipient of the WSDM 2016 Best Paper Award, the KDD 2016 Best Paper Award, and the WSDM 2018 Best Student Paper Award.
His research interests include data mining, applied machine learning, information retrieval and recommender systems.
About this article
Cite this article
Wang, Q., Li, H., Xiong, H. et al. A Simple yet Effective Framework for Active Learning to Rank. Mach. Intell. Res. 21, 169–183 (2024). https://doi.org/10.1007/s11633-023-1422-z
DOI: https://doi.org/10.1007/s11633-023-1422-z