Open
Description
I want to generate roughly 1 trillion tokens, and I was thinking of using TGI on a European supercomputer. Is there a way to do this without relying on Docker, by downloading the model natively and then loading and serving it on the compute node? @Wauplin