Counterfactual Regret Minimization (usually abbreviated as "CFR") is a technique for solving imperfect-information games, which are games in which some information is hidden from players. For example, in most card games, players are dealt cards that they keep hidden from the other players. Solving such a game means finding a Nash equilibrium, which is a strategy that cannot be improved further by incremental changes.
Cfrm is a .NET library for applying CFR to a game of two or more players. To solve such a game, you create a concrete class that inherits from an abstract base class called GameState<TAction> that defines the state of the game from the current player's point of view (this state is known as an "information set" in game theory). The type parameter TAction defines the type of actions available in the game (e.g. playing a card, making a bet, etc.). This can be an enum or any other type (such as a Card class).
If you are working in C#, the methods of GameState that you must override are:
int CurrentPlayerIdxThis is the 0-based index of the current player. For example, if there are four players in a game, their indexes are0,1,2, and3. Players do not necessarily play in that order, though. (E.g. In trick-taking games, the player who takes a trick typically leads on the next trick.)float[] TerminalValuesIf the game is in a terminal state (i.e. the game is over), this member answers an array of "payoff" values for each player. For example, in a two-player game, if player 0 wins, she might receive 1 point, while player 1 would receive -1 point for losing, resulting in a payoff array of[ 1, -1]for that outcome. If the sum of the payoffs in a terminal state is always 0, then the game is "zero-sum". If the game is not over, then this member answersnull.TAction[] LegalActionsIf the game is not in a terminal state, this member answers an array of legal actions for the player whose turn it is in the current state. For example, in a poker game, these actions might beBetorFold, while in Bridge, the legal actions would be specific bids or cards.GameState<TAction> AddAction(TAction action)This method advances the game to the next state by taking the given action on behalf of the current player.string KeyThis member answers a string that uniquely describes the state of the game from the point of view of the current player. For example, in a card game, the unique key might contain the history of the game to this point (i.e. all the cards played so far, by all players) plus the hidden cards remaining in the current player's hand.
You can think of GameState as representing all the possible nodes in a game's "move tree". Key is a node's unique ID, LegalActions are the branches leading to the node's child nodes, and AddAction moves from a node to one of its children.
Once your concrete GameState class is ready, you can run CFR by invoking the static member CounterFactualRegret.Minimize, which takes the following arguments:
numIterations: The number of CFR iterations to run. Each iteration corresponds to a single play-through of the game.numPlayers: The number of players in the game.getInitialState: This is a callback function that initializes a new game. This initialization can be random (corresponding to shuffling the deck in a card game), or it can move sequentially through all possible initial states in order.
The Minimize function returns a tuple containing two values:
float[] expectedGameValue: An array containing the expected payoffs for each player at the Nash equilibrium. If the game is zero-sum, these payoffs will sum to zero.StrategyProfile strategyProfile: A collection of strategies for the game states visited while running CFR. To access these strategies from C#, use theIDictionary<string, float[]> ToDictmember. The keys of this dictionary correspond toGameState.Key, 64B6 while the values are arrays representing the probability of taking each of theGameState.LegalActionsat that game state. This profile can be used to play the game according to the strategy found by CFR.
After minimizing regret, the resulting strategy profile can be saved to disk via its Save method. In order to play a game with a saved profile, first load it from disk using the static StrategyProfile.Load method. To play a mixed strategy (e.g. in poker), call the Sample method with a key that corresponds to the state of the game from the current player's point of view. To play a pure strategy, call the Best method, which always chooses the action with the highest probability in a given situation.
Cfrm is written in F# and supports F# implementations smoothly. The important differences from working in C# are:
- When inheriting from
GameState, you can overrideTerminalValuesOptinstead ofTerminalValuesin order to avoid returningnull. - You can call function
CounterFactualRegret.minimizeinstead of methodCounterFactualRegret.Minimize. This allows you to pass in an F# closure for thegetInitialStatecallback instead of the wrappedFuncthat is passed toMinimize.
There are working examples of Kuhn poker for both C# and F# in the unit test projects.
- Vanilla Counterfactual Regret Minimization for Engineers: Walkthrough of a Python implementation of 2-player CFR
- An Introduction to Counterfactual Regret Minimization: Detailed overview of CFR with a Java implementation
- Multiplayer CFR: Multiplayer support in Python.