tic-tac-toe-zero

Implementation of MuZero for Tic-Tac-Toe.

It plays optimally about 65-70% of the time, provided you train it long enough and get a lucky run.
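To make "plays optimally" concrete, here is a minimal sketch of a minimax oracle that lists the game-theoretically best moves in a position. It is not taken from this repository: the board encoding (a tuple of nine cells holding 'X', 'O', or ' ') and the function names `winner`, `value`, and `optimal_moves` are assumptions made for illustration, and the figure quoted above may have been measured differently.

```python
from functools import lru_cache

# Hypothetical board encoding (not from this repo's game.py):
# a tuple of 9 cells, each 'X', 'O', or ' ', indexed row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play(board, move, player):
    """Return a new board with `player` placed on cell `move`."""
    return board[:move] + (player,) + board[move + 1:]


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value for `player`, who is to move: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return 1 if w == player else -1
    if ' ' not in board:
        return 0
    other = 'O' if player == 'X' else 'X'
    # The opponent's value after our move is the negation of ours.
    return max(-value(play(board, i, player), other)
               for i in range(9) if board[i] == ' ')


def optimal_moves(board, player):
    """Return the set of legal moves that achieve the minimax value."""
    other = 'O' if player == 'X' else 'X'
    scores = {i: -value(play(board, i, player), other)
              for i in range(9) if board[i] == ' '}
    best = max(scores.values())
    return {i for i, s in scores.items() if s == best}
```

Under these assumptions, one way to estimate an "optimal play" rate is to record each position the agent faces, check whether its chosen move is in `optimal_moves(board, player)`, and report the fraction of positions where it is.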

Example Usage

git clone https://github.com/souvikshanku/tic-tac-toe-zero.git
cd tic-tac-toe-zero

python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt

# End-to-end training
python3 self_play.py

# Play against 'random' agent
python3 check_accuracy.py

