Add `oumi[quantization]` optional dependency (#1902) · oumi-ai/oumi@32393df · GitHub
Commit 32393df

Add oumi[quantization] optional dependency (#1902)
1 parent 48131d7 commit 32393df

File tree

6 files changed: +28 −14 lines changed


configs/examples/quantization/README.md

Lines changed: 7 additions & 3 deletions

@@ -4,6 +4,8 @@
 
 This directory contains example configurations for model quantization using Oumi's AWQ and BitsAndBytes quantization methods.
 
+> **NOTE**: Quantization requires a GPU to run.
+
 ## Configuration Files
 
 - **`awq_quantization_config.yaml`** - AWQ 4-bit quantization with calibration
@@ -15,7 +17,7 @@ This directory contains example configurations for model quantization using Oumi
 # Simplest command-line usage
 oumi quantize --method awq_q4_0 --model "TinyLlama/TinyLlama-1.1B-Chat-v1.0" --output quantized_model
 
-# Using configuration file (requires GPU)
+# Using configuration file
 oumi quantize --config configs/examples/quantization/awq_quantization_config.yaml
 ```
 
@@ -40,10 +42,12 @@ oumi quantize --config configs/examples/quantization/awq_quantization_config.yam
 ## Requirements
 
 ```bash
-# For AWQ quantization
+pip install oumi[quantization]
+
+# Alternatively, for AWQ quantization only
 pip install autoawq
 
-# For BitsAndBytes quantization
+# Alternatively, for BitsAndBytes quantization only
 pip install bitsandbytes
 ```

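The NOTE added above is easy to verify up front. A minimal pre-flight check, not part of this commit, assuming `torch` is importable (the `oumi` package already depends on it):

```python
# Hedged sketch: fail fast if the GPU requirement from the NOTE isn't met.
import torch

if not torch.cuda.is_available():
    raise SystemExit("Quantization requires a GPU; no CUDA device was found.")
print(f"Using GPU: {torch.cuda.get_device_name(0)}")
```
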
docs/user_guides/quantization.md

Lines changed: 6 additions & 2 deletions

@@ -4,6 +4,8 @@
 
 This guide covers the `oumi quantize` command for reducing model size while maintaining performance.
 
+> **NOTE**: Quantization requires a GPU to run.
+
 ## Quick Start
 
 ```bash
@@ -93,10 +95,12 @@ Currently supported output formats:
 ## Installation
 
 ```bash
-# For AWQ quantization
+pip install oumi[quantization]
+
+# Alternatively, for AWQ quantization only
 pip install autoawq
 
-# For BitsAndBytes quantization
+# Alternatively, for BitsAndBytes quantization only
 pip install bitsandbytes
 ```

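The docs now offer three install paths: the full `oumi[quantization]` extra, AWQ only, or BitsAndBytes only. A hypothetical helper, not from this commit, sketching how a script could detect which backends are actually importable:

```python
# Hypothetical helper: report which quantization backends are importable.
import importlib.util

def available_backends() -> list[str]:
    backends = []
    if importlib.util.find_spec("awq") is not None:  # autoawq installs the `awq` module
        backends.append("awq")
    if importlib.util.find_spec("bitsandbytes") is not None:
        backends.append("bitsandbytes")
    return backends

print(available_backends() or "none; try `pip install oumi[quantization]`")
```
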
notebooks/Oumi - A Tour.ipynb

Lines changed: 1 addition & 1 deletion

@@ -493,7 +493,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.11.8"
+   "version": "3.11.13"
   }
  },
  "nbformat": 4,

notebooks/Oumi - Quantization Tutorial.ipynb

Lines changed: 11 additions & 7 deletions

@@ -31,13 +31,17 @@
     "\n",
     "⚠️ **DEVELOPMENT STATUS**: The quantization feature is currently under active development. Some features may change in future releases.\n",
     "\n",
-    "First, let's install Oumi with GPU support and the required quantization libraries:\n",
-    "\n",
-    "```bash\n",
-    "pip install oumi[gpu]\n",
-    "pip install autoawq\n",
-    "pip install triton==3.0.0 # Required for AWQ inference compatibility\n",
-    "```"
+    "First, let's install Oumi with GPU support and the required quantization libraries:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install oumi[gpu,quantization]\n",
+    "%pip install triton==3.0.0 # Required for AWQ inference compatibility"
    ]
   },
  {

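The hunk above swaps a fenced `bash` snippet in a markdown cell for a real code cell using `%pip`. A sketch, assuming the `nbformat` package, of how such a cell could be constructed programmatically (the commit edits the notebook JSON directly; this is just illustrative):

```python
# Illustrative only: build a notebook containing the new install cell.
import nbformat

cell = nbformat.v4.new_code_cell(
    source=(
        "%pip install oumi[gpu,quantization]\n"
        "%pip install triton==3.0.0 # Required for AWQ inference compatibility"
    )
)
print(nbformat.writes(nbformat.v4.new_notebook(cells=[cell])))
```
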
pyproject.toml

Lines changed: 2 additions & 0 deletions

@@ -154,6 +154,8 @@ evaluation = [
     "sentencepiece>=0.1.98",
 ]
 
+quantization = ["autoawq>=0.2.0,<0.3", "bitsandbytes>=0.45.0,<0.46"]
+
 bitnet = ["onebitllms>=0.0.3"]
 
 cambrian = [

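After installing the new extra, the pins can be sanity-checked at runtime. A hedged sketch, not part of the commit, using only the standard library:

```python
# Confirm the extra resolved both backends; distribution names match PyPI.
from importlib.metadata import PackageNotFoundError, version

for pkg, pin in [("autoawq", ">=0.2.0,<0.3"), ("bitsandbytes", ">=0.45.0,<0.46")]:
    try:
        print(f"{pkg} {version(pkg)} (pinned {pin})")
    except PackageNotFoundError:
        print(f"{pkg} missing; `pip install oumi[quantization]` should provide it")
```
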
src/oumi/quantize/awq_quantizer.py

Lines changed: 1 addition & 1 deletion

@@ -60,7 +60,7 @@ def raise_if_requirements_not_met(self):
         if self._awq is None:
             raise RuntimeError(
                 "AWQ quantization requires autoawq library.\n"
-                "Install with: `pip install autoawq`\n"
+                "Install with: `pip install oumi[quantization]`\n"
             )
 
         if not torch.cuda.is_available():

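The changed line sits inside a lazy-import guard: `awq` is imported once if available, and `raise_if_requirements_not_met` surfaces the install hint only when quantization is actually attempted. A simplified, self-contained sketch of that pattern (not the actual class; the GPU error message here is paraphrased):

```python
# Simplified sketch of the guard pattern around the changed line.
import torch

class _AwqGuardSketch:
    def __init__(self):
        try:
            import awq  # import name for the autoawq package
        except ImportError:
            awq = None
        self._awq = awq

    def raise_if_requirements_not_met(self):
        if self._awq is None:
            raise RuntimeError(
                "AWQ quantization requires autoawq library.\n"
                "Install with: `pip install oumi[quantization]`\n"
            )
        if not torch.cuda.is_available():
            raise RuntimeError("AWQ quantization requires a CUDA-capable GPU.")
```
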
0 commit comments