DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Optimizing inference proxy for LLMs
Decentralized deep learning in PyTorch. Built to train models on thousands of volunteers across the world.
Run Mixtral-8x7B models in Colab or on consumer desktops
【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models
PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
Codebase for Aria - an Open Multimodal Native MoE
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
Tutel MoE: an optimized Mixture-of-Experts library supporting GptOss/DeepSeek/Kimi-K2/Qwen3 with FP8/NVFP4/MXFP4
Surrogate Modeling Toolbox
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models (a minimal gating sketch appears after this list)
A from-scratch implementation of a sparse mixture-of-experts language model, inspired by Andrej Karpathy's makemore :)
A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
Chinese Mixtral mixture-of-experts large language models (Chinese Mixtral MoE LLMs)
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
Implementation of Soft MoE, proposed by Brain's Vision team, in PyTorch
MoH: Multi-Head Attention as Mixture-of-Head Attention
PyTorch library for cost-effective, fast and easy serving of MoE models.
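For readers new to the pattern behind several of the projects above, here is a minimal sketch of a sparsely-gated top-k MoE layer in the spirit of Shazeer et al. (2017). The expert count, layer sizes, and softmax-over-selected-experts routing are illustrative assumptions, not code taken from any repository listed here.

```python
# Minimal sketch of a sparsely-gated top-k Mixture-of-Experts layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_hidden=2048, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts, bias=False)  # router producing expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                           # x: (batch, seq, d_model)
        tokens = x.reshape(-1, x.size(-1))          # flatten to (num_tokens, d_model)
        logits = self.gate(tokens)                  # (num_tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)        # renormalize over the selected experts
        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                       # which tokens routed to expert e, and in which slot
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel() == 0:
                continue                            # expert e received no tokens this step
            out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(tokens[token_ids])
        return out.reshape_as(x)

# Example: route a batch of token embeddings through the layer.
moe = SparseMoE()
y = moe(torch.randn(2, 16, 512))                    # -> shape (2, 16, 512)
```

Since only `top_k` experts run per token, parameter count can grow with the number of experts without a proportional increase in per-token compute, which is the core idea the libraries above optimize and scale.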