Applying AlphaZero Self-Play Tactics to LLaMA for Enhanced Chatbot Interaction
Updated Jan 5, 2024 - Python
Simple Muesli RL algorithm implementation (PyTorch)
A set of experiments and comparisons against human play with the MuZero agent from Google DeepMind, made as part of a research project with École Polytechnique.
GenesisZERO: potential applications of MCTS agents with LLMs for sequential decision-making
MuZero reinforcement learning algorithm for Chinese chess (Xiangqi)
Meta-learning experiments for the game of minichess and related rule variants.
Trains a deep reinforcement learning agent in simulation testbed environments with the DRLA library.
Trains deep reinforcement learning agents in Atari environments via the DRLA library.
MuZero for Super Mario Bros
A notebook implementation of the pseudocode from the original MuZero paper
Deep Q-learning black-box strategies for casino games
Materials for AlphaGo
A robust variant of MuZero
C++ Deep Reinforcement Learning Agent library