fix(docs): remove --branch flag from git clone command for bleeding edge setup in Installation.md (#6164)
* fix(docs): remove --branch flag from git clone command for bleeding edge setup in Installation.md
The `--branch` flag was causing a fatal error during the cloning process: `fatal: You must specify a repository to clone.` (a sketch of the failing and fixed commands follows this list)
* docs: correct config setting name from `conditioning_type` to `goal_conditioning_type`
Updated the documentation to use the correct config setting name `goal_conditioning_type` instead of `conditioning_type` in the config YAML, keeping the docs accurate and preventing confusion for users following them.
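The sketch below illustrates the clone failure and the fix. It is a minimal reconstruction: the pre-fix command is an assumption inferred from the quoted error message (`--branch` passed without a branch name, so git consumes the repository URL as the branch argument), and the URL is the project's public GitHub repository.

```sh
# Pre-fix form (assumed): `--branch` takes an argument, so git reads the URL
# as the branch name and is left with no repository on the command line.
git clone --branch https://github.com/Unity-Technologies/ml-agents.git
# fatal: You must specify a repository to clone.

# Post-fix form for the bleeding edge setup: clone the default branch directly.
git clone https://github.com/Unity-Technologies/ml-agents.git
```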
- [Memory-enhanced Agents using Recurrent Neural Networks](#memory-enhanced-agents-using-recurrent-neural-networks)
- [Self-Play](#self-play)
- [Note on Reward Signals](#note-on-reward-signals)
- [Note on Swap Steps](#note-on-swap-steps)

## Common Trainer Configurations
@@ -44,7 +45,7 @@ choice of the trainer (which we review on subsequent sections).
 |`network_settings -> num_layers`| (default = `2`) The number of hidden layers in the neural network. Corresponds to how many hidden layers are present after the observation input, or after the CNN encoding of the visual observation. For simple problems, fewer layers are likely to train faster and more efficiently. More layers may be necessary for more complex control problems. <br><br> Typical range: `1` - `3`|
 |`network_settings -> normalize`| (default = `false`) Whether normalization is applied to the vector observation inputs. This normalization is based on the running average and variance of the vector observation. Normalization can be helpful in cases with complex continuous control problems, but may be harmful with simpler discrete control problems. |
 | `network_settings -> vis_encode_type` | (default = `simple`) Encoder type for encoding visual observations. <br><br> `simple` (default) uses a simple encoder which consists of two convolutional layers, `nature_cnn` uses the CNN implementation proposed by [Mnih et al.](https://www.nature.com/articles/nature14236), consisting of three convolutional layers, and `resnet` uses the [IMPALA Resnet](https://arxiv.org/abs/1802.01561) consisting of three stacked layers, each with two residual blocks, making a much larger network than the other two. `match3` is a smaller CNN ([Gudmundsson et al.](https://www.researchgate.net/publication/328307928_Human-Like_Playtesting_with_Deep_Learning)) that can capture more granular spatial relationships and is optimized for board games. `fully_connected` uses a single fully connected dense layer as encoder without any convolutional layers. <br><br> Due to the size of the convolution kernel, there is a minimum observation size that each encoder type can handle - `simple`: 20x20, `nature_cnn`: 36x36, `resnet`: 15x15, `match3`: 5x5. `fully_connected` has no convolutional layers and thus no size limit, but since it has less representational power it should be reserved for very small inputs. Note that using the `match3` CNN with very large visual input might result in a huge observation encoding and thus potentially slow down training or cause memory issues. |
-| `network_settings -> conditioning_type` | (default = `hyper`) Conditioning type for the policy using goal observations. <br><br> `none` treats the goal observations as regular observations, `hyper` (default) uses a HyperNetwork with goal observations as input to generate some of the weights of the policy. Note that when using `hyper` the number of parameters of the network increases greatly. Therefore, it is recommended to reduce the number of `hidden_units` when using this `conditioning_type`. |
+| `network_settings -> goal_conditioning_type` | (default = `hyper`) Conditioning type for the policy using goal observations. <br><br> `none` treats the goal observations as regular observations, `hyper` (default) uses a HyperNetwork with goal observations as input to generate some of the weights of the policy. Note that when using `hyper` the number of parameters of the network increases greatly. Therefore, it is recommended to reduce the number of `hidden_units` when using this `goal_conditioning_type`. |
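For readers updating their configs, here is a minimal sketch of where the renamed setting lives, following the `behaviors -> <behavior name> -> network_settings` layout of ML-Agents trainer config YAML; the behavior name `MyBehavior` and the reduced `hidden_units` value are illustrative assumptions, not values from this commit.

```yaml
behaviors:
  MyBehavior:                       # hypothetical behavior name
    trainer_type: ppo
    network_settings:
      num_layers: 2                 # default
      normalize: false              # default
      vis_encode_type: simple       # default
      goal_conditioning_type: hyper # correct name; older docs said `conditioning_type`
      hidden_units: 64              # illustrative: reduced because `hyper` grows parameter count
```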