Deploy intelligence. Open-source infrastructure for AI agents in production.
Updated Mar 11, 2026
Mechanistic analysis of a GPT-2–like model exploring the compositionality gap in transformers. Using Logit Lens and Causal Tracing, the study identifies a deep-layer bottleneck and overcomes it via dataset enhancement, addressing the compositionality gap (NeurIPS 2024).
A mock Azure OpenAI API for seamless testing and development, supporting both streaming and non-streaming responses. Easily emulate OpenAI completions with token-based streaming in a local or Dockerized environment.
My Digital Twin Companion
DeepSeek-V3 Windows Deployment Fixer: Resolves CUDA_ERROR_OUT_OF_MEMORY, missing cublas64_12.dll, and Triton compiler errors. Optimized for RTX 30/40 series GPUs.
Complete guide to deploying private, on-premise AI and LLMs: hardware selection, model comparison (ollama vs vLLM vs llama.cpp), security hardening, and AI governance policy templates. By Petronella Technology Group.
Comprehensive guide to FastAPI, Pydantic, and SQLAlchemy for AI engineers. Learn API design, validation, and ORM workflows with practical examples and setup instructions 🐙
Production-oriented GPU inference stack with FastAPI, Docker, Redis, Prometheus, and Grafana for AI workload serving
A Streamlit-based spam classifier that predicts whether a message is spam or not spam using machine learning.
A minimal, high-performance starter kit for running AI model inference on NVIDIA GPUs using CUDA. Includes environment setup, sample kernels, and guidance for integrating ONNX/TensorRT pipelines for fast, optimized inference on modern GPU hardware.
Full-stack web application (React + Flask) for Multimodal Video Captioning. Deploys the MixCap model (BLIP-2 + Wav2Vec2) to generate video descriptions for end-users.
End-to-end pipeline for deploying deep learning models on edge devices: model conversion, quantization, hardware acceleration, and Android integration.
Skeleton Slack bot that lets developers trigger deployments, rollbacks, and status checks via simple chat commands.
Deployment of a self-hosted LLM infrastructure using Ollama and Open WebUI on Linux, including custom model creation, API integration, and system-level troubleshooting.
Multi-agent researcher using retrieval-augmented generation (RAG)
🧠 Deployment infrastructure for Zeroclaw, Dockerized and compatible with Portainer
🛠 Fix DeepSeek-V3 Windows setup issues by resolving CUDA memory errors, missing DLLs, and PyTorch-CUDA conflicts for smooth local deployment.
Compare PyTorch vs Triton inference latency with CLI tools, benchmarks, and performance plots.