tic-tac-toe-zero

Implementation of MuZero for Tic-Tac-Toe.

It plays optimally about 65-70% of the time, provided you train long enough and get lucky.
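
One way to make "plays optimally" concrete is to compare each move the agent picks with an exhaustive minimax search, which is cheap for Tic-Tac-Toe. The sketch below is only an illustration of that idea and is not taken from `check_accuracy.py`; the string board encoding and the names `winner`, `value`, `optimal_moves` and `is_optimal` are hypothetical.

```python
from functools import lru_cache

# Hypothetical board encoding (not the repository's): a 9-character string,
# row-major, using "X", "O", or "." for an empty square.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value for `player`, who is about to move: +1 win, 0 draw, -1 loss."""
    if winner(board) is not None:
        # Only the previous mover can have just completed a line,
        # so the player to move has lost.
        return -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    # The opponent's value after our move, negated, is our value.
    return max(-value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == ".")


def optimal_moves(board, player):
    """Return the set of empty squares that achieve the best minimax value."""
    other = "O" if player == "X" else "X"
    scores = {i: -value(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == "."}
    best = max(scores.values())
    return {i for i, score in scores.items() if score == best}


def is_optimal(board, player, move):
    """True if `move` is one of the minimax-optimal squares for `player`."""
    return move in optimal_moves(board, player)
```

For example, `optimal_moves("XX.OO....", "X")` returns `{2}`, the immediate win. Counting how often an agent's chosen move passes `is_optimal` over a set of positions would give a percentage comparable to the one quoted above.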

RL is hard (ToT)

Example Usage

git clone https://github.com/souvikshanku/tic-tac-toe-zero.git
cd tic-tac-toe-zero

python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt

# End-to-end training
python3 self_play.py

# Play against a 'random' agent
python3 check_accuracy.py