Hi everyone!

Can someone explain why I get different accuracy when I apply `torch.quantization.quantize_dynamic` versus `torchao.quantize_`?

More specifically, I have an LSTM model with two fully connected layers (one at the front and one at the back). To quantize it with torchao, I reimplemented the LSTM layer by hand (and verified that it behaves the same as the `nn.LSTM` implementation). I then compared dynamic int8-activation / int8-weight quantization in both libraries; the exact calls are shown further down.
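For context, here is a minimal sketch of the kind of architecture I mean (the `LSTMRegressor` name and the layer sizes are illustrative assumptions, not my exact code):

```python
import torch
import torch.nn as nn

class LSTMRegressor(nn.Module):
    """Fully connected layer -> LSTM -> fully connected layer (sizes are made up)."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.fc_in = nn.Linear(1, hidden)                      # FC layer "in the front"
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)  # swapped for a hand-rolled LSTM in the torchao run
        self.fc_out = nn.Linear(hidden, 1)                     # FC layer "in the back"

    def forward(self, x):  # x: (batch, seq_len, 1)
        h = self.fc_in(x)
        h, _ = self.lstm(h)
        return self.fc_out(h)
```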
I trained the LSTM to predict the function y = sin(x) (a mapping R -> R) and compared two quality metrics, MSE and MAE:
| Model | MSE | MAE |
| --- | --- | --- |
| Baseline (no quantization) | 0.00035 | 0.01150 |
| Quantized with torch.quantization | 0.00047 | 0.01554 |
| Quantized with torchao (run on CPU/GPU) | 0.00037 | 0.01223 |
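In case the measurement itself matters, this is roughly how I compute both metrics (a sketch; `model`, `x_test`, and `y_test` are assumed to already exist):

```python
import torch

# x_test: (batch, seq_len, 1) inputs; y_test: matching sin(x) targets
with torch.no_grad():
    pred = model(x_test)
    mse = torch.mean((pred - y_test) ** 2).item()
    mae = torch.mean(torch.abs(pred - y_test)).item()
print(f"MSE: {mse:.5f}  MAE: {mae:.5f}")
```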
As far as I understand, torch.quantization and torchao perform the same dynamic int8 quantization here, yet the metrics differ: the torch.quantization model is roughly 25% worse.
(Update: I have since received a useful answer from a torchao team developer.)

For reference, here are the exact quantization calls I compared.

torchao:
```python
# torchao: dynamic int8-activation / int8-weight quantization; quantize_ modifies `model` in place
from torchao.quantization import quantize_, Int8DynamicActivationInt8WeightConfig

quantize_(model, Int8DynamicActivationInt8WeightConfig())
```

torch.quantization:
```python
import torch.nn as nn

# Eager-mode dynamic quantization (CPU only); returns a new model in which
# only the listed module types ({nn.Linear} here) are swapped for quantized versions.
model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
```
The torchao solution was tested on GPU (an NVIDIA A100 80GB PCIe, not an MI300), with nvcc 12.1, cuDNN 9.8, and torch 2.5.1; there the metric value drops by only about 1%. But when I run the torch.quantization solution (on CPU, since GPU is not yet supported by torch.quantization), the metric value drops by about 35%.
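A minimal end-to-end version of the comparison I am running (a sketch under the assumptions above: `model` is the trained float network, `x_test`/`y_test` are held-out data, and `evaluate` is a hypothetical helper returning (MSE, MAE); deep copies keep the float baseline intact):

```python
import copy
import torch
import torch.nn as nn
from torchao.quantization import quantize_, Int8DynamicActivationInt8WeightConfig

# torchao path: quantize_ mutates its argument, so work on a copy
model_ao = copy.deepcopy(model)
quantize_(model_ao, Int8DynamicActivationInt8WeightConfig())

# eager-mode path: CPU only; quantize_dynamic returns a new quantized model
model_eager = torch.quantization.quantize_dynamic(
    copy.deepcopy(model).cpu(), {nn.Linear}, dtype=torch.qint8
)

print("baseline:          ", evaluate(model, x_test, y_test))
print("torchao:           ", evaluate(model_ao, x_test, y_test))
print("torch.quantization:", evaluate(model_eager, x_test.cpu(), y_test.cpu()))
```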
What could possibly be wrong?
cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel @msaroufim