


RazvanDu/LayerwiseQuant

Layerwise Quantization

We show that quantizing each layer of an LLM at its own bit width gives the model as a whole a dynamic bit-level, one that is not restricted to integer values.
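
To make the idea of a non-integer bit-level concrete, here is a minimal arithmetic sketch. It is not the authors' implementation (that code has not been released yet); the function name `average_bit_level`, the 32-layer model, and the 4-bit/2-bit split are all assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def average_bit_level(
    layer_bits: Sequence[int],
    layer_params: Optional[Sequence[int]] = None,
) -> float:
    """Return the parameter-weighted mean bit width across layers.

    layer_bits:   bit width assigned to each layer (hypothetical values).
    layer_params: parameter count of each layer; if omitted, every layer
                  is assumed to have the same size.
    """
    if not layer_bits:
        raise ValueError("layer_bits must not be empty")
    if layer_params is None:
        layer_params = [1] * len(layer_bits)
    if len(layer_params) != len(layer_bits):
        raise ValueError("layer_bits and layer_params must have equal length")

    total_params = sum(layer_params)
    total_bits = sum(b * p for b, p in zip(layer_bits, layer_params))
    return total_bits / total_params


# Hypothetical 32-layer model: 24 layers kept at 4 bits, 8 layers at 2 bits.
bits = [4] * 24 + [2] * 8
print(average_bit_level(bits))  # 3.5
```

Changing how many layers sit at each bit width moves the model-wide average in small steps, which is what allows a target such as 3.5 bits even though every individual layer uses an integer bit width.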

The code will be available here in the near future.

The paper is available at https://arxiv.org/abs/2406.17415 and is under review at EMNLP 2024.

If you use this work, please consider citing it:

@misc{dumitru2024layerwisequantizationpragmaticeffective,
      title={Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels}, 
      author={Razvan-Gabriel Dumitru and Vikas Yadav and Rishabh Maheshwary and Paul-Ioan Clotan and Sathwik Tejaswi Madhusudhan and Mihai Surdeanu},
      year={2024},
      eprint={2406.17415},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2406.17415}, 
}
