EasyLLM

Built upon Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the training code with a focus on usability while preserving training efficiency.

Install

  • Install Python requirements

    pip install -r requirements.txt

    Other dependencies:

    • flash-attn (for dropout_layer_norm; you may need to compile it yourself)
  • Clone DeepSpeed and add it to your PYTHONPATH

    export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH
  • Install the package in development mode

    pip install -e . -v
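
Putting the steps together, a full setup might look like the sketch below. The repository URLs are real, but the DeepSpeed path and the flash-attn source layout are assumptions that may vary with your environment and flash-attn version:

    # Clone EasyLLM and install its Python requirements
    git clone https://github.com/ModelTC/EasyLLM.git
    cd EasyLLM
    pip install -r requirements.txt

    # dropout_layer_norm is compiled from the flash-attention source tree
    # (assumption: the csrc/layer_norm path may differ across versions)
    git clone https://github.com/Dao-AILab/flash-attention.git
    pip install ./flash-attention/csrc/layer_norm

    # Make a local DeepSpeed checkout importable (path is a placeholder)
    export PYTHONPATH=/path/to/DeepSpeed:$PYTHONPATH

    # Install EasyLLM itself in development (editable) mode
    pip install -e . -v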

Train

Train Example
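
Since EasyLLM builds on Megatron-DeepSpeed, training jobs are typically started through the DeepSpeed launcher. The sketch below is only illustrative: --num_gpus is a standard DeepSpeed launcher flag, but the script name and its arguments are placeholders (see the Train Example above for the real entry point):

    # Hypothetical launch sketch -- train.py and --config are placeholders;
    # --num_gpus is a standard DeepSpeed launcher flag.
    deepspeed --num_gpus 8 train.py --config configs/llama2_7b.yaml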

Infer and Eval

Infer Example
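
Inference and evaluation usually run as a single script against a trained checkpoint. Again a placeholder sketch, not EasyLLM's actual CLI (the Infer Example above documents the real one):

    # Hypothetical inference sketch -- script name and flags are placeholders
    python infer.py \
        --checkpoint ./checkpoints/llama2-7b \
        --prompt "Hello, who are you?"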

Supported Models

  • qwen (14b)
  • internlm (7b/20b)
  • baichuan 1/2 (7b/13b)
  • llama 1/2 (7b/13b/70b)

Model Example

Data

Data Example
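
Fine-tuning data for this kind of trainer is commonly stored as JSON lines, one sample per line. The field names below are purely illustrative; the Data Example above defines EasyLLM's actual schema:

    # Illustrative only -- "prompt"/"response" are hypothetical field names
    cat > sample_data.jsonl <<'EOF'
    {"prompt": "What is 3D parallelism?", "response": "Combining tensor, pipeline, and data parallelism."}
    {"prompt": "Which llama sizes are supported?", "response": "7b, 13b, and 70b."}
    EOF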

3D Parallel Configuration

Parallel Example
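
The three parallel sizes must multiply to the world size: world_size = TP × PP × DP. For example, on 8 GPUs a Megatron-style setting (flag names follow Megatron-DeepSpeed conventions; EasyLLM's own keys may differ, see the Parallel Example above) could be:

    # 8 GPUs = tensor-parallel 2 x pipeline-parallel 2 x data-parallel 2
    # Flag names follow Megatron-DeepSpeed; EasyLLM's exact keys may differ.
    deepspeed --num_gpus 8 train.py \
        --tensor-model-parallel-size 2 \
        --pipeline-model-parallel-size 2
    # data-parallel size is implied: 8 / (2 * 2) = 2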

Speed Benchmark

Speed Benchmark

Dynamic Checkpoint

To reduce both training time and memory usage, EasyLLM supports Dynamic Checkpoint: based on the input token size, it enables activation checkpointing for only a subset of layers, recomputing their activations in the backward pass instead of storing them. The configuration file settings are as follows:

Dynamic Checkpoint Example
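
As a rough illustration of the idea (every key name below is hypothetical; the Dynamic Checkpoint Example above shows the real schema), such a config maps input token sizes to how many layers are checkpointed:

    # Hypothetical sketch -- key names are placeholders, not EasyLLM's schema.
    # Larger inputs checkpoint more layers, trading recompute time for memory.
    cat > dynamic_checkpoint.yaml <<'EOF'
    dynamic_checkpoint:
      enabled: true
      size_map:     # input token size -> number of layers to checkpoint
        2048: 0     # small inputs fit in memory, no recomputation
        4096: 16    # medium inputs checkpoint half the layers
        8192: 32    # large inputs checkpoint every layer
    EOF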

License

This repository is released under the Apache-2.0 license.

Acknowledgement

We learned a lot from the following projects when developing EasyLLM: Megatron-DeepSpeed, DeepSpeed, and the HuggingFace Trainer.
