Merge branch 'modular-diffusers' into modular-refactor · huggingface/diffusers@ce642e9

Commit ce642e9

Merge branch 'modular-diffusers' into modular-refactor
2 parents 6d5beef + 6a509ba commit ce642e9

File tree

406 files changed: +25816 additions, -6736 deletions


.github/workflows/nightly_tests.yml

Lines changed: 50 additions & 1 deletion
@@ -180,6 +180,55 @@ jobs:
           pip install slack_sdk tabulate
           python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
 
+  run_torch_compile_tests:
+    name: PyTorch Compile CUDA tests
+
+    runs-on:
+      group: aws-g4dn-2xlarge
+
+    container:
+      image: diffusers/diffusers-pytorch-compile-cuda
+      options: --gpus 0 --shm-size "16gb" --ipc host
+
+    steps:
+    - name: Checkout diffusers
+      uses: actions/checkout@v3
+      with:
+        fetch-depth: 2
+
+    - name: NVIDIA-SMI
+      run: |
+        nvidia-smi
+    - name: Install dependencies
+      run: |
+        python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+        python -m uv pip install -e [quality,test,training]
+    - name: Environment
+      run: |
+        python utils/print_env.py
+    - name: Run torch compile tests on GPU
+      env:
+        HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
+        RUN_COMPILE: yes
+      run: |
+        python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
+    - name: Failure short reports
+      if: ${{ failure() }}
+      run: cat reports/tests_torch_compile_cuda_failures_short.txt
+
+    - name: Test suite reports artifacts
+      if: ${{ always() }}
+      uses: actions/upload-artifact@v4
+      with:
+        name: torch_compile_test_reports
+        path: reports
+
+    - name: Generate Report and Notify Channel
+      if: always()
+      run: |
+        pip install slack_sdk tabulate
+        python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
+
   run_big_gpu_torch_tests:
     name: Torch tests on big GPU
     strategy:

@@ -417,7 +466,7 @@ jobs:
             additional_deps: ["peft"]
           - backend: "gguf"
             test_location: "gguf"
-            additional_deps: []
+            additional_deps: ["peft"]
          - backend: "torchao"
             test_location: "torchao"
             additional_deps: []
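
Note on the new `run_torch_compile_tests` job above: it exports `RUN_COMPILE=yes` and selects tests whose names match `-k "compile"`. As a rough, illustrative sketch (not a test that exists in this repository), a test collected under that gating could look like:

```python
import os

import pytest
import torch

# Mirror the workflow gating: RUN_COMPILE=yes plus an available CUDA device.
RUN_COMPILE = os.getenv("RUN_COMPILE", "") == "yes"


@pytest.mark.skipif(not RUN_COMPILE, reason="set RUN_COMPILE=yes to enable torch.compile tests")
@pytest.mark.skipif(not torch.cuda.is_available(), reason="requires a CUDA device")
def test_compile_smoke():
    # Compile a tiny module and check it agrees with eager execution.
    model = torch.nn.Linear(8, 8).cuda()
    compiled = torch.compile(model)
    x = torch.randn(2, 8, device="cuda")
    torch.testing.assert_close(compiled(x), model(x), rtol=1e-4, atol=1e-4)
```

Locally, the same selection can be reproduced with `RUN_COMPILE=yes python -m pytest -k "compile" tests/`.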

.github/workflows/release_tests_fast.yml

Lines changed: 1 addition & 1 deletion
@@ -335,7 +335,7 @@ jobs:
     - name: Environment
       run: |
        python utils/print_env.py
-    - name: Run example tests on GPU
+    - name: Run torch compile tests on GPU
      env:
        HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
        RUN_COMPILE: yes

docker/diffusers-onnxruntime-cpu/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -28,9 +28,9 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3 -m uv pip install --no-cache-dir \
-        torch==2.1.2 \
-        torchvision==0.16.2 \
-        torchaudio==2.1.2 \
+        torch \
+        torchvision \
+        torchaudio \
         onnxruntime \
         --extra-index-url https://download.pytorch.org/whl/cpu && \
     python3 -m uv pip install --no-cache-dir \

docs/source/en/_toctree.yml

Lines changed: 43 additions & 33 deletions
@@ -175,7 +175,7 @@
     title: gguf
   - local: quantization/torchao
     title: torchao
-    - local: quantization/quanto
+  - local: quantization/quanto
     title: quanto
   title: Quantization Methods
 - sections:

@@ -265,19 +265,23 @@
   sections:
   - local: api/models/overview
     title: Overview
+  - local: api/models/auto_model
+    title: AutoModel
   - sections:
     - local: api/models/controlnet
      title: ControlNetModel
+    - local: api/models/controlnet_union
+      title: ControlNetUnionModel
    - local: api/models/controlnet_flux
      title: FluxControlNetModel
    - local: api/models/controlnet_hunyuandit
      title: HunyuanDiT2DControlNetModel
+    - local: api/models/controlnet_sana
+      title: SanaControlNetModel
    - local: api/models/controlnet_sd3
      title: SD3ControlNetModel
    - local: api/models/controlnet_sparsectrl
      title: SparseControlNetModel
-    - local: api/models/controlnet_union
-      title: ControlNetUnionModel
     title: ControlNets
   - sections:
     - local: api/models/allegro_transformer3d

@@ -286,30 +290,32 @@
      title: AuraFlowTransformer2DModel
    - local: api/models/cogvideox_transformer3d
      title: CogVideoXTransformer3DModel
-    - local: api/models/consisid_transformer3d
-      title: ConsisIDTransformer3DModel
    - local: api/models/cogview3plus_transformer2d
      title: CogView3PlusTransformer2DModel
    - local: api/models/cogview4_transformer2d
      title: CogView4Transformer2DModel
+    - local: api/models/consisid_transformer3d
+      title: ConsisIDTransformer3DModel
    - local: api/models/dit_transformer2d
      title: DiTTransformer2DModel
    - local: api/models/easyanimate_transformer3d
      title: EasyAnimateTransformer3DModel
    - local: api/models/flux_transformer
      title: FluxTransformer2DModel
+    - local: api/models/hidream_image_transformer
+      title: HiDreamImageTransformer2DModel
    - local: api/models/hunyuan_transformer2d
      title: HunyuanDiT2DModel
    - local: api/models/hunyuan_video_transformer_3d
      title: HunyuanVideoTransformer3DModel
    - local: api/models/latte_transformer3d
      title: LatteTransformer3DModel
-    - local: api/models/lumina_nextdit2d
-      title: LuminaNextDiT2DModel
-    - local: api/models/lumina2_transformer2d
-      title: Lumina2Transformer2DModel
    - local: api/models/ltx_video_transformer3d
      title: LTXVideoTransformer3DModel
+    - local: api/models/lumina2_transformer2d
+      title: Lumina2Transformer2DModel
+    - local: api/models/lumina_nextdit2d
+      title: LuminaNextDiT2DModel
    - local: api/models/mochi_transformer3d
      title: MochiTransformer3DModel
    - local: api/models/omnigen_transformer

@@ -318,10 +324,10 @@
      title: PixArtTransformer2DModel
    - local: api/models/prior_transformer
      title: PriorTransformer
-    - local: api/models/sd3_transformer2d
-      title: SD3Transformer2DModel
    - local: api/models/sana_transformer2d
      title: SanaTransformer2DModel
+    - local: api/models/sd3_transformer2d
+      title: SD3Transformer2DModel
    - local: api/models/stable_audio_transformer
      title: StableAudioDiTModel
    - local: api/models/transformer2d

@@ -336,10 +342,10 @@
      title: StableCascadeUNet
    - local: api/models/unet
      title: UNet1DModel
-    - local: api/models/unet2d
-      title: UNet2DModel
    - local: api/models/unet2d-cond
      title: UNet2DConditionModel
+    - local: api/models/unet2d
+      title: UNet2DModel
    - local: api/models/unet3d-cond
      title: UNet3DConditionModel
    - local: api/models/unet-motion

@@ -348,6 +354,10 @@
      title: UViT2DModel
     title: UNets
   - sections:
+    - local: api/models/asymmetricautoencoderkl
+      title: AsymmetricAutoencoderKL
+    - local: api/models/autoencoder_dc
+      title: AutoencoderDC
    - local: api/models/autoencoderkl
      title: AutoencoderKL
    - local: api/models/autoencoderkl_allegro

@@ -364,10 +374,6 @@
      title: AutoencoderKLMochi
    - local: api/models/autoencoder_kl_wan
      title: AutoencoderKLWan
-    - local: api/models/asymmetricautoencoderkl
-      title: AsymmetricAutoencoderKL
-    - local: api/models/autoencoder_dc
-      title: AutoencoderDC
    - local: api/models/consistency_decoder_vae
      title: ConsistencyDecoderVAE
    - local: api/models/autoencoder_oobleck

@@ -420,6 +426,8 @@
      title: ControlNet with Stable Diffusion 3
    - local: api/pipelines/controlnet_sdxl
      title: ControlNet with Stable Diffusion XL
+    - local: api/pipelines/controlnet_sana
+      title: ControlNet-Sana
    - local: api/pipelines/controlnetxs
      title: ControlNet-XS
    - local: api/pipelines/controlnetxs_sdxl

@@ -444,6 +452,8 @@
      title: Flux
    - local: api/pipelines/control_flux_inpaint
      title: FluxControlInpaint
+    - local: api/pipelines/hidream
+      title: HiDream-I1
    - local: api/pipelines/hunyuandit
      title: Hunyuan-DiT
    - local: api/pipelines/hunyuan_video

@@ -511,40 +521,40 @@
   - sections:
    - local: api/pipelines/stable_diffusion/overview
      title: Overview
-    - local: api/pipelines/stable_diffusion/text2img
-      title: Text-to-image
+    - local: api/pipelines/stable_diffusion/depth2img
+      title: Depth-to-image
+    - local: api/pipelines/stable_diffusion/gligen
+      title: GLIGEN (Grounded Language-to-Image Generation)
+    - local: api/pipelines/stable_diffusion/image_variation
+      title: Image variation
    - local: api/pipelines/stable_diffusion/img2img
      title: Image-to-image
    - local: api/pipelines/stable_diffusion/svd
      title: Image-to-video
    - local: api/pipelines/stable_diffusion/inpaint
      title: Inpainting
-    - local: api/pipelines/stable_diffusion/depth2img
-      title: Depth-to-image
-    - local: api/pipelines/stable_diffusion/image_variation
-      title: Image variation
+    - local: api/pipelines/stable_diffusion/k_diffusion
+      title: K-Diffusion
+    - local: api/pipelines/stable_diffusion/latent_upscale
+      title: Latent upscaler
+    - local: api/pipelines/stable_diffusion/ldm3d_diffusion
+      title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
    - local: api/pipelines/stable_diffusion/stable_diffusion_safe
      title: Safe Stable Diffusion
+    - local: api/pipelines/stable_diffusion/sdxl_turbo
+      title: SDXL Turbo
    - local: api/pipelines/stable_diffusion/stable_diffusion_2
      title: Stable Diffusion 2
    - local: api/pipelines/stable_diffusion/stable_diffusion_3
      title: Stable Diffusion 3
    - local: api/pipelines/stable_diffusion/stable_diffusion_xl
      title: Stable Diffusion XL
-    - local: api/pipelines/stable_diffusion/sdxl_turbo
-      title: SDXL Turbo
-    - local: api/pipelines/stable_diffusion/latent_upscale
-      title: Latent upscaler
    - local: api/pipelines/stable_diffusion/upscale
      title: Super-resolution
-    - local: api/pipelines/stable_diffusion/k_diffusion
-      title: K-Diffusion
-    - local: api/pipelines/stable_diffusion/ldm3d_diffusion
-      title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
    - local: api/pipelines/stable_diffusion/adapter
      title: T2I-Adapter
-    - local: api/pipelines/stable_diffusion/gligen
-      title: GLIGEN (Grounded Language-to-Image Generation)
+    - local: api/pipelines/stable_diffusion/text2img
+      title: Text-to-image
     title: Stable Diffusion
   - local: api/pipelines/stable_unclip
     title: Stable unCLIP

docs/source/en/api/loaders/lora.md

Lines changed: 19 additions & 0 deletions
@@ -20,11 +20,15 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`FluxLoraLoaderMixin`] provides similar functions for [Flux](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux).
 - [`CogVideoXLoraLoaderMixin`] provides similar functions for [CogVideoX](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox).
 - [`Mochi1LoraLoaderMixin`] provides similar functions for [Mochi](https://huggingface.co/docs/diffusers/main/en/api/pipelines/mochi).
+- [`AuraFlowLoraLoaderMixin`] provides similar functions for [AuraFlow](https://huggingface.co/fal/AuraFlow).
 - [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
 - [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
 - [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
 - [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
+- [`WanLoraLoaderMixin`] provides similar functions for [Wan](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan).
+- [`CogView4LoraLoaderMixin`] provides similar functions for [CogView4](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogview4).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
+- [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
 
 <Tip>

@@ -56,6 +60,9 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 ## Mochi1LoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.Mochi1LoraLoaderMixin
+## AuraFlowLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.AuraFlowLoraLoaderMixin
 
 ## LTXVideoLoraLoaderMixin
 

@@ -73,10 +80,22 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 
 [[autodoc]] loaders.lora_pipeline.Lumina2LoraLoaderMixin
 
+## CogView4LoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.CogView4LoraLoaderMixin
+
+## WanLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin
+
 ## AmusedLoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
 
+## HiDreamImageLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.HiDreamImageLoraLoaderMixin
+
 ## LoraBaseMixin
 
 [[autodoc]] loaders.lora_base.LoraBaseMixin
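
For context on the additions above: each pipeline-specific mixin provides `load_lora_weights` for its pipeline family, while `LoraBaseMixin` contributes the shared utilities (fuse, unfuse, unload). A minimal usage sketch follows; the base checkpoint and LoRA repo id are placeholders, not taken from this commit.

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder checkpoint and LoRA repo ids -- substitute real ones.
pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.load_lora_weights("some-user/some-style-lora", adapter_name="style")
pipe.fuse_lora()  # optional: bake the LoRA into the base weights for faster inference
image = pipe("an astronaut riding a horse on the moon").images[0]

pipe.unfuse_lora()
pipe.unload_lora_weights()
```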

docs/source/en/api/models/auto_model.md

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# AutoModel
+
+The `AutoModel` is designed to make it easy to load a checkpoint without needing to know the specific model class. `AutoModel` automatically retrieves the correct model class from the checkpoint `config.json` file.
+
+```python
+from diffusers import AutoModel, AutoPipelineForText2Image
+
+unet = AutoModel.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet")
+pipe = AutoPipelineForText2Image.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", unet=unet)
+```
+
+
+## AutoModel
+
+[[autodoc]] AutoModel
+  - all
+  - from_pretrained
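
As a quick illustration of the behavior the new page describes (this snippet is not part of the commit): `AutoModel` resolves the concrete class from each subfolder's `config.json`, so different subfolders of the same checkpoint load as different model classes.

```python
from diffusers import AutoModel

repo = "stable-diffusion-v1-5/stable-diffusion-v1-5"
unet = AutoModel.from_pretrained(repo, subfolder="unet")
vae = AutoModel.from_pretrained(repo, subfolder="vae")

# Expected: UNet2DConditionModel and AutoencoderKL, each resolved from its config.json.
print(type(unet).__name__, type(vae).__name__)
```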

docs/source/en/api/models/autoencoderkl_allegro.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import AutoencoderKLAllegro
 
-vae = AutoencoderKLCogVideoX.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
+vae = AutoencoderKLAllegro.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
 ```
 
 ## AutoencoderKLAllegro
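
A hypothetical follow-up to the corrected snippet (not part of the commit): video VAEs such as this one are typically used with tiling enabled so that decoding full clips fits in memory; the `enable_tiling` call below is an assumption about this class's API rather than something shown in the diff.

```python
import torch
from diffusers import AutoencoderKLAllegro

vae = AutoencoderKLAllegro.from_pretrained(
    "rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32
).to("cuda")
vae.enable_tiling()  # assumption: tiled encode/decode is supported by this VAE
```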
