Peixian Ma1,2 Xialie Zhuang1,3 Chengjin Xu1,4 Xuhui Jiang1,4 Ran Chen1 Jian Guo1
1IDEA Research, International Digital Economy Academy 2The Hong Kong University of Science and Technology (Guangzhou) 3University of Chinese Academy of Sciences 4DataArc Tech Ltd.
🔥 Our work has been accepted to NeurIPS 2025. Feel free to star and cite our work! ✨
Natural Language to SQL (NL2SQL) enables intuitive interaction with databases by transforming natural language queries into structured SQL statements. Despite recent advances in human-computer interaction for database applications, significant challenges persist, particularly in inference performance on complex scenarios involving multi-table joins and nested queries. Current methodologies primarily rely on supervised fine-tuning (SFT) to train NL2SQL models, which can limit adaptability and interpretability in new environments (e.g., finance and healthcare). To enhance reasoning performance in these complex situations, we introduce SQL-R1, a novel NL2SQL reasoning model trained with reinforcement learning (RL) algorithms. We design a specialized RL reward function tailored to NL2SQL tasks and discuss the impact of cold start on the effectiveness of RL training. In addition, we achieve competitive accuracy using only a small amount of synthetic NL2SQL data for augmented training, and we further explore data engineering for RL. In our experiments, SQL-R1 achieves execution accuracy of 88.6% on the Spider benchmark and 67.1% on BIRD.
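For the exact reward design, please refer to the paper. As a rough illustration only, the sketch below shows one plausible composite reward for NL2SQL combining format compliance, executability, and result match against a gold query on a SQLite database. The function name, weights, and `sql` code-block convention are illustrative assumptions, not the paper's definitions.

```python
# Illustrative sketch of a composite NL2SQL reward (not the paper's exact formula).
import re
import sqlite3


def nl2sql_reward(response: str, gold_sql: str, db_path: str) -> float:
    """Score a model response on format, executability, and result match."""
    reward = 0.0

    # Format reward: the response must wrap its final SQL in a ```sql ... ``` block.
    match = re.search(r"```sql\s*(.+?)\s*```", response, re.DOTALL)
    if match is None:
        return -1.0  # unparsable output is penalized outright
    reward += 1.0
    pred_sql = match.group(1)

    conn = sqlite3.connect(db_path)
    try:
        pred_rows = conn.execute(pred_sql).fetchall()
        reward += 1.0  # execution reward: the predicted query runs without error
        gold_rows = conn.execute(gold_sql).fetchall()
        if sorted(map(str, pred_rows)) == sorted(map(str, gold_rows)):
            reward += 2.0  # result reward: execution results match the gold SQL
    except sqlite3.Error:
        reward -= 1.0  # penalize queries that fail to execute
    finally:
        conn.close()
    return reward
```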
```bibtex
@article{ma2025sql,
  title={SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning},
  author={Ma, Peixian and Zhuang, Xialie and Xu, Chengjin and Jiang, Xuhui and Chen, Ran and Guo, Jian},
  journal={arXiv preprint arXiv:2504.08600},
  year={2025}
}
```

- [2025.09.18] 🎉 SQL-R1 has been accepted by NeurIPS 2025! We will update the full version of the paper and the poster soon. Feel free to star and cite our work!
- [2025.05.27] 🎉 We have released the full version of SQL-R1.
- [2025.05.21] 🎉 We have released our model weights on HuggingFace! Check out the Model Weights section below.
- [2025.04.11] 📑 Our paper is now available on arXiv.
- 📝 Update the camera-ready version of the paper, homepage, and poster (coming soon!)
- 📊 Release model weights on HuggingFace and ModelScope
- 🔧 Open source training code and RL dataset
- 📝 Detailed documentation
- 🛠️ Environment setup guide
We are excited to release our SQL-R1 model weights! You can find them on HuggingFace and ModelScope:
| Model | Size | HuggingFace Link | ModelScope Link |
|---|---|---|---|
| SQL-R1 (3B) | 3B | 🤗 Download | - |
| SQL-R1 (7B) | 7B | 🤗 Download | 🤖 Download |
| SQL-R1 (14B) | 14B | 🤗 Download | - |
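Once downloaded, the weights load like any Qwen2.5-Coder-based checkpoint via `transformers`. A minimal sketch follows; the repo id `IDEA-Research/SQL-R1-7B` is a placeholder (substitute the actual link from the table above), and the prompt format is illustrative rather than the exact template used in our evaluation scripts.

```python
# Minimal inference sketch with HuggingFace transformers.
# NOTE: the repo id below is a placeholder; use the real link from the table.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IDEA-Research/SQL-R1-7B"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Illustrative prompt: schema plus a natural-language question.
prompt = (
    "Given the schema:\n"
    "CREATE TABLE singer(id INT, name TEXT, age INT);\n"
    "Question: How many singers are older than 30?\n"
    "Answer with a SQL query."
)
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```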
This repository is organized as follows:
```
SQL-R1/
├── 📁 data/                 # Datasets and databases
│   ├── 📁 Spider/
│   └── 📁 BIRD/
├── 📁 models/               # Foundation models or checkpoints
│   ├── 📁 Qwen2.5-Coder-3B-Instruct/
│   └── 📁 Qwen2.5-Coder-7B-Instruct/
├── 📁 db_info/              # Database information files (inference only)
├── 📁 example_data/         # Example data (training)
├── 📁 sh/                   # Scripts for data processing, training, inference, and evaluation
│   ├── 📄 train.sh
│   ├── 📄 inference.sh
│   ├── 📄 eval_spider.sh
│   └── 📄 eval_bird.sh
├── 📁 src/                  # Source code
│   ├── 📁 data_preprocess/
│   ├── 📁 evaluations/
│   ├── 📁 utils/
│   ├── 📄 inference.py
│   └── 📄 evaluation_*.py
├── 📁 verl/                 # verl reinforcement learning framework
├── 📄 requirements.txt
└── 📄 README.md
```
Note
Before getting started, make sure your computing environment meets the following requirements:
- Environment: Python 3.9+
- CUDA Version: 12.0+ (required for verl and vLLM integration)
- GPUs: 8 × 80GB+ GPUs (for training) / 2 × 40GB GPUs (for inference)
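As a convenience (this snippet is not part of the repo), you can sanity-check the Python version, CUDA availability, and per-GPU memory before launching training:

```python
# Quick environment sanity check before training or inference.
import sys

import torch

assert sys.version_info >= (3, 9), "Python 3.9+ is required"
assert torch.cuda.is_available(), "CUDA is not available"

print(f"CUDA version: {torch.version.cuda}")
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GB")
```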
