Xiaobao Wu

Hi, I’m now an Assistant Professor at the School of Computer Science, Shanghai Jiao Tong University. I received my Ph.D. from Nanyang Technological University, my Master’s degree from Tsinghua University, and my Bachelor’s degree from Southeast University. I was a visiting scholar at the University of California, Santa Barbara. My research interests lie mainly in natural language processing, artificial intelligence, and large language models.

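Dr. Xiaobao Wu is an Assistant Professor and Ph.D. supervisor at the School of Computer Science, Shanghai Jiao Tong University, and a Shanghai Overseas High-Level Young Talent. He received his Ph.D. from Nanyang Technological University, Singapore, and his Master’s and Bachelor’s degrees from Tsinghua University and Southeast University, respectively. He was previously a visiting scholar at the University of California, Santa Barbara, and a postdoctoral research fellow at Nanyang Technological University. His main research areas include artificial intelligence, natural language processing, and large language models. He has published more than 40 papers in top international conferences and journals such as NeurIPS, ICML, ICLR, ACL, and EMNLP, and has long served as an Area Chair for international conferences such as ACL and EMNLP. His first-author paper received the Senior Area Chair Highlights Award at ACL 2025.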

📢 Join us!

We are seeking highly self-motivated students (interns, Master’s, and PhD) with a strong passion for frontier research in Natural Language Processing (NLP) and Large Language Models (LLMs), and we recruit on an ongoing basis. If you are interested in joining our group, please send your CV and transcripts to this email. Please format the subject line as: [Prospective Position]-[Name]-[University].

News

Jan, 2026	Joined SJTU as an Assistant Professor.
Jul, 2025	AntiLeakBench won the Senior Area Chair Highlights Award (0.56%) of ACL 2025!
May, 2025	Seven papers accepted to ACL 2025. We released a survey on learning from rewards, covering reinforcement learning (in RLHF, DPO, and GRPO), reward-guided decoding, and post-hoc correction. One paper accepted to ICML 2025.
Feb, 2025	Invited to serve as Area Chair for ACL 2025. One paper accepted to NAACL 2025.
Dec, 2024	Successfully defended my PhD thesis! Two papers accepted to AAAI 2025! One paper accepted to COLING 2025! One paper accepted to ACM/SAC 2025! One paper accepted to TMLR!
Sep, 2024	One paper accepted to NeurIPS 2024 and three papers accepted to the EMNLP 2024 main conference!
Jul, 2024	One paper accepted to ECCV 2024!
Jun, 2024	Two papers accepted to ACL 2024 (one findings and one demo)!
Mar, 2024	One paper accepted to NAACL 2024!
Jan, 2024	Our Neural Topic Modeling survey paper was accepted to Artificial Intelligence Review!
Dec, 2023	Two papers accepted to AAAI 2024!

Selected Publications

ACL Oral Presentation

AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge

Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, and William Yang Wang

In Annual Meeting of the Association for Computational Linguistics (ACL), 2025

🏆Senior Area Chair Highlights Award

@inproceedings{wu2025antileak,
  title = {AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge},
  author = {Wu, Xiaobao and Pan, Liangming and Xie, Yuxi and Zhou, Ruiwen and Zhao, Shuai and Ma, Yubo and Du, Mingzhe and Mao, Rui and Luu, Anh Tuan and Wang, William Yang},
  booktitle = {Annual Meeting of the Association for Computational Linguistics (ACL)},
  year = {2025},
  url = {https://arxiv.org/pdf/2412.13670.pdf},
}

NeurIPS

FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model

Xiaobao Wu, Thong Nguyen, Delvin Ce Zhang, William Yang Wang, and Anh Tuan Luu

In Neural Information Processing Systems (NeurIPS), 2024

@inproceedings{wu2024fastopic,
  title = {FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model},
  author = {Wu, Xiaobao and Nguyen, Thong and Zhang, Delvin Ce and Wang, William Yang and Luu, Anh Tuan},
  booktitle = {Neural Information Processing Systems (NeurIPS)},
  year = {2024},
  url = {https://arxiv.org/pdf/2405.17978},
}

EMNLP

AKEW: Assessing Knowledge Editing in the Wild

Xiaobao Wu, Liangming Pan, William Yang Wang, and Anh Tuan Luu

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

@inproceedings{wu2024akew,
  title = {AKEW: Assessing Knowledge Editing in the Wild},
  author = {Wu, Xiaobao and Pan, Liangming and Wang, William Yang and Luu, Anh Tuan},
  year = {2024},
  booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  url = {https://arxiv.org/pdf/2402.18909},
}

EMNLP

Are LLMs Good Zero-shot Fallacy Classifiers?

Fengjun Pan^#, Xiaobao Wu^#, Zongrui Li, and Anh Tuan Luu

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024

@inproceedings{pan2024fallacy,
  title = {Are LLMs Good Zero-shot Fallacy Classifiers?},
  author = {Pan, Fengjun and Wu, Xiaobao and Li, Zongrui and Luu, Anh Tuan},
  year = {2024},
  booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  url = {https://arxiv.org/pdf/2410.15050},
}

AIR

A Survey on Neural Topic Models: Methods, Applications, and Challenges

Xiaobao Wu, Thong Nguyen, and Anh Tuan Luu

Artificial Intelligence Review (AIR), 2024

@article{wu2024survey,
  title = {A Survey on Neural Topic Models: Methods, Applications, and Challenges},
  author = {Wu, Xiaobao and Nguyen, Thong and Luu, Anh Tuan},
  journal = {Artificial Intelligence Review (AIR)},
  url = {https://doi.org/10.1007/s10462-023-10661-7},
  year = {2024},
  publisher = {Springer}
}

ACL Demo

Towards the TopMost: A Topic Modeling System Toolkit

Xiaobao Wu, Fengjun Pan, and Anh Tuan Luu

In Annual Meeting of the Association for Computational Linguistics: System Demonstration Track (ACL Demo), 2024

@inproceedings{wu2024topmost,
  title = {Towards the TopMost: A Topic Modeling System Toolkit},
  author = {Wu, Xiaobao and Pan, Fengjun and Luu, Anh Tuan},
  year = {2024},
  booktitle = {Annual Meeting of the Association for Computational Linguistics: System Demonstration Track (ACL Demo)},
  url = {https://arxiv.org/pdf/2309.06908.pdf},
}

ICML

Effective Neural Topic Modeling with Embedding Clustering Regularization

Xiaobao Wu, Xinshuai Dong, Thong Nguyen, and Anh Tuan Luu

In International Conference on Machine Learning (ICML), 2023

@inproceedings{wu2023effective,
  booktitle = {International Conference on Machine Learning (ICML)},
  organization = {PMLR},
  title = {Effective Neural Topic Modeling with Embedding Clustering Regularization},
  author = {Wu, Xiaobao and Dong, Xinshuai and Nguyen, Thong and Luu, Anh Tuan},
  year = {2023},
  url = {https://arxiv.org/pdf/2306.04217},
}

ACL

Fact-Checking Complex Claims with Program-Guided Reasoning

Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, and Preslav Nakov

In Annual Meeting of the Association for Computational Linguistics (ACL), 2023

@inproceedings{pan2023fact,
  booktitle = {Annual Meeting of the Association for Computational Linguistics (ACL)},
  title = {Fact-Checking Complex Claims with Program-Guided Reasoning},
  author = {Pan, Liangming and Wu, Xiaobao and Lu, Xinyuan and Luu, Anh Tuan and Wang, William Yang and Kan, Min-Yen and Nakov, Preslav},
  year = {2023},
  address = {Toronto, Canada},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2023.acl-long.386},
  pages = {6981--7004},
}

EMNLP

Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning

Xiaobao Wu, Anh Tuan Luu, and Xinshuai Dong

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

@inproceedings{wu2022mitigating,
  title = {Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning},
  author = {Wu, Xiaobao and Luu, Anh Tuan and Dong, Xinshuai},
  year = {2022},
  booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  publisher = {Association for Computational Linguistics},
  address = {Abu Dhabi, United Arab Emirates},
  pages = {2748--2760},
  url = {https://aclanthology.org/2022.emnlp-main.176},
}

EMNLP

Short Text Topic Modeling with Topic Distribution Quantization and Negative Sampling Decoder

Xiaobao Wu, Chunping Li, Yan Zhu, and Yishu Miao

In Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

@inproceedings{wu2020short,
  title = {Short Text Topic Modeling with Topic Distribution Quantization and Negative Sampling Decoder},
  author = {Wu, Xiaobao and Li, Chunping and Zhu, Yan and Miao, Yishu},
  year = {2020},
  booktitle = {Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  address = {Online},
  pages = {1772--1782},
  url = {https://aclanthology.org/2020.emnlp-main.138.pdf},
}
