Stars
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
🔊 Text-Prompted Generative Audio Model
Enhance Tesseract OCR output for scanned PDFs by applying Large Language Model (LLM) corrections.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Official inference repo for FLUX.1 models
You like pytorch? You like micrograd? You love tinygrad! ❤️
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
Stable Diffusion web UI
A generative speech model for daily dialogue.
fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
CodiumAI Cover-Agent: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
VILA - a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
Evaluate the accuracy of LLM generated outputs
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Foundational Models for State-of-the-Art Speech and Text Translation
rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain. In which, we present a ca…
FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.