update ascend doc (#3420) · InternLM/lmdeploy@a4c43b4 · GitHub

Commit a4c43b4: update ascend doc (#3420)

1 parent e057f52

File tree: 4 files changed (+60, -33 lines)


docs/en/get_started/ascend/get_started.md

Lines changed: 10 additions & 0 deletions
````diff
@@ -158,6 +158,16 @@ lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR --device npu
 
 Please check [supported_models](../../supported_models/supported_models.md) before use this feature.
 
+### w8a8 SMOOTH_QUANT
+
+Run the following commands to quantize weights on Atlas 800T A2.
+
+```bash
+lmdeploy lite smooth_quant $HF_MODEL --work-dir $WORK_DIR --device npu
+```
+
+Please check [supported_models](../../supported_models/supported_models.md) before use this feature.
+
 ### int8 KV-cache Quantization
 
 Ascend backend has supported offline int8 KV-cache Quantization on eager mode.
````
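Beyond the diff itself: the W8A8 weights written to `$WORK_DIR` can then be loaded through LMDeploy's PyTorch engine. A minimal sketch, assuming lmdeploy's `pipeline` API with `PytorchEngineConfig(device_type="ascend")`; the output directory name and prompt are illustrative, not taken from the commit:

```python
# Minimal sketch (not from the commit): load the W8A8 weights produced by
# `lmdeploy lite smooth_quant` and run inference on an Ascend NPU.
from lmdeploy import pipeline, PytorchEngineConfig

if __name__ == "__main__":
    pipe = pipeline(
        "./internlm2_5-7b-chat-w8a8",  # hypothetical $WORK_DIR from the step above
        backend_config=PytorchEngineConfig(
            tp=1,                      # tensor-parallel degree
            device_type="ascend",      # select the Ascend backend
        ),
    )
    print(pipe(["Hi, please introduce yourself."]))
```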

docs/en/supported_models/supported_models.md

Lines changed: 20 additions & 17 deletions
````diff
@@ -115,20 +115,23 @@ The following tables detail the models supported by LMDeploy's TurboMind engine
 
 ## PyTorchEngine on Huawei Ascend Platform
 
-| Model | Size | Type | FP16/BF16(eager) | FP16/BF16(graph) | W4A16(eager) |
-| :------------: | :------: | :--: | :--------------: | :--------------: | :----------: |
-| Llama2 | 7B - 70B | LLM | Yes | Yes | Yes |
-| Llama3 | 8B | LLM | Yes | Yes | Yes |
-| Llama3.1 | 8B | LLM | Yes | Yes | Yes |
-| InternLM2 | 7B - 20B | LLM | Yes | Yes | Yes |
-| InternLM2.5 | 7B - 20B | LLM | Yes | Yes | Yes |
-| InternLM3 | 8B | LLM | Yes | Yes | Yes |
-| Mixtral | 8x7B | LLM | Yes | Yes | No |
-| QWen1.5-MoE | A2.7B | LLM | Yes | - | No |
-| QWen2(.5) | 7B | LLM | Yes | Yes | No |
-| QWen2-MoE | A14.57B | LLM | Yes | - | No |
-| DeepSeek-V2 | 16B | LLM | No | Yes | No |
-| InternVL(v1.5) | 2B-26B | MLLM | Yes | - | Yes |
-| InternVL2 | 1B-40B | MLLM | Yes | Yes | Yes |
-| CogVLM2-chat | 19B | MLLM | Yes | No | - |
-| GLM4V | 9B | MLLM | Yes | No | - |
+| Model | Size | Type | FP16/BF16(eager) | FP16/BF16(graph) | W8A8(graph) | W4A16(eager) |
+| :------------: | :------: | :--: | :--------------: | :--------------: | :---------: | :----------: |
+| Llama2 | 7B - 70B | LLM | Yes | Yes | Yes | Yes |
+| Llama3 | 8B | LLM | Yes | Yes | Yes | Yes |
+| Llama3.1 | 8B | LLM | Yes | Yes | Yes | Yes |
+| InternLM2 | 7B - 20B | LLM | Yes | Yes | Yes | Yes |
+| InternLM2.5 | 7B - 20B | LLM | Yes | Yes | Yes | Yes |
+| InternLM3 | 8B | LLM | Yes | Yes | Yes | Yes |
+| Mixtral | 8x7B | LLM | Yes | Yes | No | No |
+| QWen1.5-MoE | A2.7B | LLM | Yes | - | No | No |
+| QWen2(.5) | 7B | LLM | Yes | Yes | Yes | Yes |
+| QWen2-VL | 2B, 7B | MLLM | Yes | Yes | - | - |
+| QWen2.5-VL | 3B - 72B | MLLM | Yes | Yes | - | - |
+| QWen2-MoE | A14.57B | LLM | Yes | - | No | No |
+| DeepSeek-V2 | 16B | LLM | No | Yes | No | No |
+| InternVL(v1.5) | 2B-26B | MLLM | Yes | - | Yes | Yes |
+| InternVL2 | 1B-40B | MLLM | Yes | Yes | Yes | Yes |
+| InternVL2.5 | 1B-78B | MLLM | Yes | Yes | Yes | Yes |
+| CogVLM2-chat | 19B | MLLM | Yes | No | - | - |
+| GLM4V | 9B | MLLM | Yes | No | - | - |
````
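The eager/graph columns in this table correspond to the PyTorch engine's `eager_mode` switch. A hedged sketch of toggling it, assuming the `PytorchEngineConfig` fields documented in LMDeploy's Ascend guide; the model path is illustrative:

```python
# Sketch: choose eager vs. graph execution on Ascend. Per the table above,
# use eager_mode=True for models whose graph-mode column reads "No" or "-".
from lmdeploy import pipeline, PytorchEngineConfig

eager_cfg = PytorchEngineConfig(device_type="ascend", eager_mode=True)
graph_cfg = PytorchEngineConfig(device_type="ascend", eager_mode=False)

# Illustrative model choice; check the table for per-model support first.
pipe = pipeline("internlm/internlm2_5-7b-chat", backend_config=graph_cfg)
print(pipe(["Shanghai is"]))
```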

docs/zh_cn/get_started/ascend/get_started.md

Lines changed: 10 additions & 0 deletions
````diff
@@ -154,6 +154,16 @@ lmdeploy lite auto_awq $HF_MODEL --work-dir $WORK_DIR --device npu
 
 Please refer to [supported models](../../supported_models/supported_models.md) for the list of supported models.
 
+### w8a8 SMOOTH_QUANT
+
+Run the following commands to perform W8A8 weight quantization on Atlas 800T A2.
+
+```bash
+lmdeploy lite smooth_quant $HF_MODEL --work-dir $WORK_DIR --device npu
+```
+
+Please refer to [supported models](../../supported_models/supported_models.md) for the list of supported models.
+
 ### int8 KV-cache Quantization
 
 The Ascend backend now supports offline int8 KV-cache quantization in eager mode.
````
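The closing context lines above mention offline int8 KV-cache quantization in eager mode. A hedged sketch of enabling an int8 KV cache at load time, assuming `quant_policy=8` is accepted by `PytorchEngineConfig` (an assumption about recent lmdeploy releases, not something this commit shows; the model path is illustrative):

```python
# Sketch: int8 KV cache (quant_policy=8) combined with eager mode on Ascend.
# quant_policy=8 here is an assumption about PytorchEngineConfig, not taken
# from this commit.
from lmdeploy import pipeline, PytorchEngineConfig

cfg = PytorchEngineConfig(device_type="ascend", eager_mode=True, quant_policy=8)
pipe = pipeline("internlm/internlm2_5-7b-chat", backend_config=cfg)
print(pipe(["Hi, please introduce yourself."]))
```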

docs/zh_cn/supported_models/supported_models.md

Lines changed: 20 additions & 16 deletions
````diff
@@ -115,19 +115,23 @@
 
 ## PyTorchEngine on Huawei Ascend Platform
 
-| Model | Size | Type | FP16/BF16(eager) | FP16/BF16(graph) | W4A16(eager) |
-| :------------: | :------: | :--: | :--------------: | :--------------: | :----------: |
-| Llama2 | 7B - 70B | LLM | Yes | Yes | Yes |
-| Llama3 | 8B | LLM | Yes | Yes | Yes |
-| Llama3.1 | 8B | LLM | Yes | Yes | Yes |
-| InternLM2 | 7B - 20B | LLM | Yes | Yes | Yes |
-| InternLM2.5 | 7B - 20B | LLM | Yes | Yes | Yes |
-| Mixtral | 8x7B | LLM | Yes | Yes | No |
-| QWen1.5-MoE | A2.7B | LLM | Yes | - | No |
-| QWen2(.5) | 7B | LLM | Yes | Yes | No |
-| QWen2-MoE | A14.57B | LLM | Yes | - | No |
-| DeepSeek-V2 | 16B | LLM | No | Yes | No |
-| InternVL(v1.5) | 2B-26B | MLLM | Yes | - | Yes |
-| InternVL2 | 1B-40B | MLLM | Yes | Yes | Yes |
-| CogVLM2-chat | 19B | MLLM | Yes | No | - |
-| GLM4V | 9B | MLLM | Yes | No | - |
+| Model | Size | Type | FP16/BF16(eager) | FP16/BF16(graph) | W8A8(graph) | W4A16(eager) |
+| :------------: | :------: | :--: | :--------------: | :--------------: | :---------: | :----------: |
+| Llama2 | 7B - 70B | LLM | Yes | Yes | Yes | Yes |
+| Llama3 | 8B | LLM | Yes | Yes | Yes | Yes |
+| Llama3.1 | 8B | LLM | Yes | Yes | Yes | Yes |
+| InternLM2 | 7B - 20B | LLM | Yes | Yes | Yes | Yes |
+| InternLM2.5 | 7B - 20B | LLM | Yes | Yes | Yes | Yes |
+| InternLM3 | 8B | LLM | Yes | Yes | Yes | Yes |
+| Mixtral | 8x7B | LLM | Yes | Yes | No | No |
+| QWen1.5-MoE | A2.7B | LLM | Yes | - | No | No |
+| QWen2(.5) | 7B | LLM | Yes | Yes | Yes | Yes |
+| QWen2-VL | 2B, 7B | MLLM | Yes | Yes | - | - |
+| QWen2.5-VL | 3B - 72B | MLLM | Yes | Yes | - | - |
+| QWen2-MoE | A14.57B | LLM | Yes | - | No | No |
+| DeepSeek-V2 | 16B | LLM | No | Yes | No | No |
+| InternVL(v1.5) | 2B-26B | MLLM | Yes | - | Yes | Yes |
+| InternVL2 | 1B-40B | MLLM | Yes | Yes | Yes | Yes |
+| InternVL2.5 | 1B-78B | MLLM | Yes | Yes | Yes | Yes |
+| CogVLM2-chat | 19B | MLLM | Yes | No | - | - |
+| GLM4V | 9B | MLLM | Yes | No | - | - |
````
