

CORL (Clean Offline Reinforcement Learning)


🧵 CORL is an Offline Reinforcement Learning library that provides high-quality, easy-to-follow single-file implementations of SOTA offline reinforcement learning algorithms. Each implementation is backed by a research-friendly codebase, allowing you to run or tune thousands of experiments. It is heavily inspired by cleanrl for online RL; check them out too! The highlight features of CORL are:

  • 📜 Single-file implementations
  • 📈 Benchmarked implementations (11+ offline algorithms, 5+ offline-to-online algorithms, 30+ datasets with detailed logs)
  • 🖼 Weights and Biases integration

You can read more about CORL design and main results in our technical paper.

Tip

โญ If you're interested in __discrete control__, make sure to check out our new library โ€” [Katakomba](https://github.com/corl-team/katakomba). It provides both discrete control algorithms augmented with recurrence and an offline RL benchmark for the NetHack Learning environment.

Info

**Minari** and **Gymnasium** support: [Farama-Foundation/Minari](https://github.com/Farama-Foundation/Minari) is the
next generation of D4RL that will continue to be maintained and introduce new features and datasets.
Please see their [announcement](https://farama.org/Announcing-Minari) for further details.
We are currently migrating to Minari step by step; the progress
can be tracked [here](https://github.com/corl-team/CORL/issues/2). This will allow us to significantly update dependencies,
simplify installation, and give users access to many new datasets out of the box!

Warning

CORL (similarly to CleanRL) is not a modular library and therefore is not meant to be imported.
At the cost of some duplicated code, we make all implementation details of an ORL algorithm variant easy
to understand. You should consider using CORL if you want to 1) understand and control all implementation details
of an algorithm or 2) rapidly prototype advanced features that other modular ORL libraries do not support.
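To give a flavor of what "all implementation details in one file" means in practice, here is a deliberately tiny behavioral-cloning sketch. It is an illustrative toy, not CORL code: the actual `offline/any_percent_bc.py` uses PyTorch, D4RL datasets, and an MLP policy, while this sketch uses NumPy, synthetic data, and a linear policy fitted by least squares (the simplest instance of the BC regression objective).

```python
import numpy as np

# Toy "offline dataset": states paired with actions from a known linear expert.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 2))      # hidden expert policy (illustrative)
states = rng.normal(size=(256, 4))    # logged states
actions = states @ W_true             # expert actions (noise-free)

# Behavioral cloning with a linear policy reduces to least-squares
# regression from states to actions.
W_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The cloned policy should imitate the expert on held-out states.
test_states = rng.normal(size=(32, 4))
mse = float(np.mean((test_states @ W_bc - test_states @ W_true) ** 2))
print(round(mse, 6))  # → 0.0 (exact recovery in the noise-free case)
```

A real single-file implementation adds a config dataclass, dataset loading, an optimization loop, evaluation rollouts, and wandb logging, but all of it lives in that one script, which is exactly what makes it easy to read and modify.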

Algorithms Implemented

| Algorithm | Variants Implemented | Wandb Report |
|-----------|----------------------|--------------|
| **Offline and Offline-to-Online** | | |
| ✅ Conservative Q-Learning for Offline Reinforcement Learning (CQL) | `offline/cql.py` `finetune/cql.py` docs | Offline, Offline-to-online |
| ✅ Accelerating Online Reinforcement Learning with Offline Datasets (AWAC) | `offline/awac.py` `finetune/awac.py` docs | Offline, Offline-to-online |
| ✅ Offline Reinforcement Learning with Implicit Q-Learning (IQL) | `offline/iql.py` `finetune/iql.py` docs | Offline, Offline-to-online |
| **Offline-to-Online only** | | |
| ✅ Supported Policy Optimization for Offline Reinforcement Learning (SPOT) | `finetune/spot.py` docs | Offline-to-online |
| ✅ Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning (Cal-QL) | `finetune/cal_ql.py` docs | Offline-to-online |
| **Offline only** | | |
| ✅ Behavioral Cloning (BC) | `offline/any_percent_bc.py` docs | Offline |
| ✅ Behavioral Cloning-10% (BC-10%) | `offline/any_percent_bc.py` docs | Offline |
| ✅ A Minimalist Approach to Offline Reinforcement Learning (TD3+BC) | `offline/td3_bc.py` docs | Offline |
| ✅ Decision Transformer: Reinforcement Learning via Sequence Modeling (DT) | `offline/dt.py` docs | Offline |
| ✅ Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble (SAC-N) | `offline/sac_n.py` docs | Offline |
| ✅ Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble (EDAC) | `offline/edac.py` docs | Offline |
| ✅ Revisiting the Minimalist Approach to Offline Reinforcement Learning (ReBRAC) | `offline/rebrac.py` docs | Offline |
| ✅ Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size (LB-SAC) | `offline/lb_sac.py` docs | Offline Gym-MuJoCo |

Citing CORL

If you use CORL in your work, please use the following BibTeX entry:

@inproceedings{tarasov2022corl,
  title={CORL: Research-oriented Deep Offline Reinforcement Learning Library},
  author={Denis Tarasov and Alexander Nikulin and Dmitry Akimov and Vladislav Kurenkov and Sergey Kolesnikov},
  booktitle={3rd Offline RL Workshop: Offline RL as a ``Launchpad''},
  year={2022},
  url={https://openreview.net/forum?id=SyAS49bBcv}
}