Release GPTQModel v1.7.2 · ModelCloud/GPTQModel · GitHub

GPTQModel v1.7.2

@Qubitium released this 19 Jan 03:52
· 395 commits to main since this release
d762379

What's Changed

⚡ Effective BPW (bits per weight) is now logged during load().
⚡ Reduce loading time on Intel Arc A770/B580 XPU by 3.3x.
⚡ Reduce memory usage in MLX conversion.
🐛 Fix Marlin kernel auto-select not checking the CUDA compute capability.
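
For readers unfamiliar with the "effective BPW" figure mentioned above, here is a minimal sketch of how such a number is commonly estimated for group-wise quantization. It is not GPTQModel's implementation; the function name `effective_bpw` and its parameters are hypothetical, and the formula assumes each group of weights stores one 16-bit scale plus one zero point packed at the same width as the weights.

```python
from typing import Optional


def effective_bpw(bits: int, group_size: int,
                  scale_bits: int = 16,
                  zero_bits: Optional[int] = None) -> float:
    """Estimate bits per weight for group-wise quantization.

    Assumption (not taken from GPTQModel): every `group_size` weights
    share one scale of `scale_bits` bits and one zero point of
    `zero_bits` bits (defaulting to the weight width `bits`).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if zero_bits is None:
        zero_bits = bits
    # Per-group metadata is amortized over all weights in the group.
    overhead = (scale_bits + zero_bits) / group_size
    return bits + overhead


# 4-bit weights, group size 128: 4 + (16 + 4) / 128
print(effective_bpw(4, 128))  # 4.15625
```

Under these assumptions a smaller group size raises the overhead, which is why the effective figure can sit noticeably above the nominal bit width.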

Full Changelog: v1.7.0...v1.7.2
