HuangOwen

Starred repositories

VPTQ, A Flexible and Extreme low-bit quantization algorithm

Python 121 5 Updated Oct 4, 2024

State-of-the-art Parameter-Efficient MoE Fine-tuning Method

Python 68 8 Updated Aug 22, 2024

[NeurIPS 2024 Oral🔥] DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs.

Python 57 3 Updated Oct 3, 2024

Next-Token Prediction is All You Need

Python 799 22 Updated Sep 30, 2024

QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs.

Python 63 5 Updated Sep 12, 2024

High-Resolution Image Synthesis with Latent Diffusion Models

Jupyter Notebook 11,594 1,509 Updated Feb 29, 2024

Fast Hadamard transform in CUDA, with a PyTorch interface

C 94 14 Updated May 24, 2024

[EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization

Python 19 2 Updated Sep 24, 2024

[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.

Python 75 8 Updated May 16, 2024

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Python 87 7 Updated Oct 1, 2024
Python 13 Updated Sep 4, 2024

A family of compressed models obtained via pruning and knowledge distillation

259 16 Updated Oct 2, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

4,006 218 Updated Oct 4, 2024

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,304 224 Updated Jun 14, 2024

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,102 187 Updated Aug 20, 2024

[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators

Python 28 Updated Sep 11, 2024

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

Go 91,979 7,241 Updated Oct 3, 2024

Official repo for consistency models.

Python 6,079 411 Updated Mar 22, 2024

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 631 24 Updated Sep 27, 2024
Python 336 34 Updated Sep 23, 2024

Consistency Models Made Easy

Python 197 7 Updated Sep 23, 2024

The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.

Python 52,322 5,515 Updated Oct 4, 2024

Awesome LLMs on Device: A Comprehensive Survey

777 108 Updated Sep 24, 2024

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

Python 520 20 Updated Sep 26, 2024

Official PyTorch and Diffusers Implementation of "LinFusion: 1 GPU, 1 Minute, 16K Image"

Python 212 14 Updated Oct 2, 2024

Kolors Team

Python 3,663 242 Updated Sep 4, 2024

A paper list about diffusion models for natural language processing.

172 6 Updated Aug 28, 2023

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 575 45 Updated Sep 4, 2024

Boosting 4-bit inference kernels with 2:4 Sparsity

Cuda 47 2 Updated Sep 4, 2024

[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark

Python 349 14 Updated Jul 9, 2024