Play the board game Santorini with this Reinforcement Learning agent and custom Gym environment

pranavsb/santorini-RL

Santorini RL

This is a PettingZoo environment for the board game Santorini. (PettingZoo is similar to OpenAI Gym, but for multi-agent tasks.)

Using this environment, we attempt to solve Santorini with different RL techniques such as PPO and DQN.

See /env for the PettingZoo environment and /algorithms for the RL code and models.

Environment:

Reusable PettingZoo environment (similar to OpenAI Gym) for the Santorini board game. The official rules can be found at: https://roxley.com/products/santorini

This is a classic environment, since Santorini is a turn-based board game.
Only ASCII rendering of the environment is supported, and two players (agents) play against each other.

Note that we currently have three modifications to the official game rules:
1. We randomly place the worker pieces at the start of the game.
2. We don't support God powers.
3. We have an unlimited number of building pieces.

None of the modifications should have a significant effect on gameplay or favor a particular player.
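
As an illustration of the first modification, random placement can be sketched as sampling four distinct cells from the 5x5 board. This is only a minimal sketch, not the code in /env: the function name `place_workers_randomly` and the returned dictionary layout are assumptions made for this example.

```python
import random

BOARD_SIZE = 5  # Santorini is played on a 5x5 grid


def place_workers_randomly(rng=random):
    """Hypothetical helper: give each player two workers on distinct cells."""
    cells = [(r, c) for r in range(BOARD_SIZE) for c in range(BOARD_SIZE)]
    chosen = rng.sample(cells, 4)  # sampling without replacement, so no overlaps
    return {"player_1": chosen[:2], "player_2": chosen[2:]}


# Passing a seeded generator makes the placement reproducible.
placement = place_workers_randomly(random.Random(0))
```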

| Property           | Value                                         |
|--------------------|-----------------------------------------------|
| Actions            | Discrete                                      |
| Parallel API       | No                                            |
| Manual Control     | No                                            |
| Agent Names        | `agents= ['player_1', 'player_2']`            |
| Number of Agents   | 2                                             |
| Action Shape       | (1)                                           |
| Action Values      | [0, 127]                                      |
| Observation Shape  | (3, 5, 5)                                     |
| Observation Values | [0, 4]                                        |


The action space has 2 * 8 * 8 = 128 actions, which encode the choice of worker piece, the direction to move, and the direction to build.
This is represented as a Discrete(128) action space. A single Discrete space is used instead of a MultiDiscrete space, since
that's what many standard implementations have done, for example the Chess implementation in PettingZoo.
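
One way to flatten the three choices into a single integer in [0, 127] is mixed-radix encoding. The sketch below is an assumption for illustration: the ordering (worker, then move direction, then build direction), the order of the eight directions, and the names `encode_action` and `decode_action` are not taken from the code in /env.

```python
NUM_WORKERS = 2
NUM_DIRECTIONS = 8

# Assumed ordering of the eight compass directions as (row, col) offsets.
DIRECTIONS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]


def encode_action(worker, move_dir, build_dir):
    """Pack (worker, move_dir, build_dir) into one integer in [0, 127]."""
    return (worker * NUM_DIRECTIONS + move_dir) * NUM_DIRECTIONS + build_dir


def decode_action(action):
    """Inverse of encode_action."""
    rest, build_dir = divmod(action, NUM_DIRECTIONS)
    worker, move_dir = divmod(rest, NUM_DIRECTIONS)
    return worker, move_dir, build_dir


# The second worker moving and building in the last direction is action 127.
largest = encode_action(1, 7, 7)
```

With this layout, `DIRECTIONS[move_dir]` and `DIRECTIONS[build_dir]` give the offsets to apply to the worker's position and to its new position, respectively.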

Observation space consists of three 5x5 planes, represented as Box(3, 5, 5). The first 5x5 plane is 1 for the
agent's worker pieces and 0 otherwise. The second 5x5 plane is 1 for the opponent's worker pieces and 0 otherwise.
The third 5x5 plane represents the height of the board at a given cell in the grid - this ranges from 0 (no buildings) to 4 (dome).
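
The three planes can be assembled as in the following sketch. It uses plain nested lists to stay dependency-free, and the function name `build_observation` and its arguments are hypothetical; the actual environment may build the observation differently.

```python
BOARD_SIZE = 5


def build_observation(own_workers, opponent_workers, heights):
    """Hypothetical helper returning [own plane, opponent plane, height plane]."""
    own = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    opp = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
    for r, c in own_workers:
        own[r][c] = 1  # 1 where the acting agent has a worker
    for r, c in opponent_workers:
        opp[r][c] = 1  # 1 where the opponent has a worker
    height_plane = [list(row) for row in heights]  # copy; values 0 (empty) to 4 (dome)
    return [own, opp, height_plane]


heights = [[0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
heights[2][2] = 4  # a dome in the centre cell
obs = build_observation([(0, 0), (1, 1)], [(3, 3), (4, 4)], heights)
```

Converting `obs` with `numpy.asarray` would give an array of shape (3, 5, 5), matching the Box space described above.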

Reward is 1 for winning, -1 for losing.
