diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
Lines changed: 4 additions & 0 deletions
diff --git a/docs/source/en/api/models/ernie_image_transformer2d.md b/docs/source/en/api/models/ernie_image_transformer2d.md
Lines changed: 21 additions & 0 deletions
diff --git a/docs/source/en/api/pipelines/ernie_image.md b/docs/source/en/api/pipelines/ernie_image.md
Lines changed: 86 additions & 0 deletions
diff --git a/src/diffusers/__init__.py b/src/diffusers/__init__.py
Lines changed: 4 additions & 0 deletions
diff --git a/src/diffusers/models/__init__.py b/src/diffusers/models/__init__.py
Lines changed: 2 additions & 0 deletions
diff --git a/src/diffusers/models/transformers/__init__.py b/src/diffusers/models/transformers/__init__.py
Lines changed: 1 addition & 0 deletions
@@ -350,6 +350,8 @@
         title: DiTTransformer2DModel
       - local: api/models/easyanimate_transformer3d
         title: EasyAnimateTransformer3DModel
+      - local: api/models/ernie_image_transformer2d
+        title: ErnieImageTransformer2DModel
       - local: api/models/flux2_transformer
         title: Flux2Transformer2DModel
       - local: api/models/flux_transformer
@@ -534,6 +536,8 @@
         title: DiT
       - local: api/pipelines/easyanimate
         title: EasyAnimate
+      - local: api/pipelines/ernie_image
+        title: ERNIE-Image
       - local: api/pipelines/flux
         title: Flux
       - local: api/pipelines/flux2
 
@@ -0,0 +1,21 @@
+<!--Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# ErnieImageTransformer2DModel
+
+A Transformer model for image-like data from [ERNIE-Image](https://huggingface.co/baidu/ERNIE-Image).
+
+The same model class is also used by the distilled [ERNIE-Image-Turbo](https://huggingface.co/baidu/ERNIE-Image-Turbo) checkpoint.
+
+## ErnieImageTransformer2DModel
+
+[[autodoc]] ErnieImageTransformer2DModel
@@ -0,0 +1,86 @@
+<!--Copyright 2025 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# ERNIE-Image
+
+<div class="flex flex-wrap space-x-1">
+  <img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
+</div>
+
+[ERNIE-Image](https://huggingface.co/baidu/ERNIE-Image) is an efficient image generation model with 8B parameters. Currently, only two models are being released:
+
+|Model|Hugging Face|
+|---|---|
+|ERNIE-Image|https://huggingface.co/baidu/ERNIE-Image|
+|ERNIE-Image-Turbo|https://huggingface.co/baidu/ERNIE-Image-Turbo|
+
+## ERNIE-Image
+
+ERNIE-Image pairs a relatively compact architecture with solid instruction-following capability. Built on an 8B DiT backbone, it delivers performance that is comparable in some scenarios to larger (20B+) models. It performs relatively consistently in instruction understanding and execution and in text generation (e.g., English / Chinese / Japanese).
+
+## ERNIE-Image-Turbo
+
+ERNIE-Image-Turbo is a distilled variant of ERNIE-Image that requires only 8 NFEs (number of function evaluations). It is a more efficient alternative whose performance is comparable to the full model in certain cases.
+
+## ErnieImagePipeline
+
+Use [`ErnieImagePipeline`] to generate images from text prompts. The pipeline enables the Prompt Enhancer (PE) by default, which enhances the user's raw prompt to improve output quality, though it may reduce instruction-following accuracy.
+
+We provide a pretrained 3B-parameter PE model; however, using a larger language model (e.g., Gemini or ChatGPT) for prompt enhancement may yield better results. The system prompt template is available [here](https://huggingface.co/baidu/ERNIE-Image/blob/main/pe/chat_template.jinja).
+
+If you prefer not to use PE, set `use_pe=False`.
+
+```python
+import torch
+from diffusers import ErnieImagePipeline
+
+# Load the base ERNIE-Image checkpoint in bfloat16.
+pipe = ErnieImagePipeline.from_pretrained("baidu/ERNIE-Image", torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+# If you are running low on GPU VRAM, enable offloading instead of calling `pipe.to("cuda")`:
+# pipe.enable_model_cpu_offload()
+
+prompt = "一只黑白相间的中华田园犬"  # "A black-and-white Chinese rural dog"
+images = pipe(
+    prompt=prompt,
+    height=1024,
+    width=1024,
+    num_inference_steps=50,
+    guidance_scale=4.0,
+    generator=torch.Generator("cuda").manual_seed(42),
+    use_pe=True,  # set to False to skip the Prompt Enhancer
+).images
+images[0].save("ernie-image-output.png")
+```
+
+```python
+import torch
+from diffusers import ErnieImagePipeline
+
+# Load the distilled ERNIE-Image-Turbo checkpoint in bfloat16.
+pipe = ErnieImagePipeline.from_pretrained("baidu/ERNIE-Image-Turbo", torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+# If you are running low on GPU VRAM, enable offloading instead of calling `pipe.to("cuda")`:
+# pipe.enable_model_cpu_offload()
+
+prompt = "一只黑白相间的中华田园犬"  # "A black-and-white Chinese rural dog"
+images = pipe(
+    prompt=prompt,
+    height=1024,
+    width=1024,
+    num_inference_steps=8,  # Turbo is distilled for few-step sampling (8 NFEs)
+    guidance_scale=1.0,
+    generator=torch.Generator("cuda").manual_seed(42),
+    use_pe=True,
+).images
+images[0].save("ernie-image-turbo-output.png")
+```
@@ -235,6 +235,7 @@
             "CosmosTransformer3DModel",
             "DiTTransformer2DModel",
             "EasyAnimateTransformer3DModel",
+            "ErnieImageTransformer2DModel",
             "Flux2Transformer2DModel",
             "FluxControlNetModel",
             "FluxMultiControlNetModel",
@@ -527,6 +528,7 @@
             "EasyAnimateControlPipeline",
             "EasyAnimateInpaintPipeline",
             "EasyAnimatePipeline",
+            "ErnieImagePipeline",
             "Flux2KleinKVPipeline",
             "Flux2KleinPipeline",
             "Flux2Pipeline",
@@ -1037,6 +1039,7 @@
             CosmosTransformer3DModel,
             DiTTransformer2DModel,
             EasyAnimateTransformer3DModel,
+            ErnieImageTransformer2DModel,
             Flux2Transformer2DModel,
             FluxControlNetModel,
             FluxMultiControlNetModel,
@@ -1304,6 +1307,7 @@
             EasyAnimateControlPipeline,
             EasyAnimateInpaintPipeline,
             EasyAnimatePipeline,
+            ErnieImagePipeline,
             Flux2KleinKVPipeline,
             Flux2KleinPipeline,
             Flux2Pipeline,
 
@@ -101,6 +101,7 @@
     _import_structure["transformers.transformer_cogview4"] = ["CogView4Transformer2DModel"]
     _import_structure["transformers.transformer_cosmos"] = ["CosmosTransformer3DModel"]
     _import_structure["transformers.transformer_easyanimate"] = ["EasyAnimateTransformer3DModel"]
+    _import_structure["transformers.transformer_ernie_image"] = ["ErnieImageTransformer2DModel"]
     _import_structure["transformers.transformer_flux"] = ["FluxTransformer2DModel"]
     _import_structure["transformers.transformer_flux2"] = ["Flux2Transformer2DModel"]
     _import_structure["transformers.transformer_glm_image"] = ["GlmImageTransformer2DModel"]
@@ -219,6 +220,7 @@
             DiTTransformer2DModel,
             DualTransformer2DModel,
             EasyAnimateTransformer3DModel,
+            ErnieImageTransformer2DModel,
             Flux2Transformer2DModel,
             FluxTransformer2DModel,
             GlmImageTransformer2DModel,
 
@@ -25,6 +25,7 @@
     from .transformer_cogview4 import CogView4Transformer2DModel
     from .transformer_cosmos import CosmosTransformer3DModel
     from .transformer_easyanimate import EasyAnimateTransformer3DModel
+    from .transformer_ernie_image import ErnieImageTransformer2DModel
     from .transformer_flux import FluxTransformer2DModel
     from .transformer_flux2 import Flux2Transformer2DModel
     from .transformer_glm_image import GlmImageTransformer2DModel