
Commit 661d2b1

Merge branch 'main' into device-map-direct
2 parents 59ac2f3 + b0dc51d commit 661d2b1

File tree: 89 files changed, +4990 / -2432 lines


docs/source/en/modular_diffusers/guiders.md (1 addition, 15 deletions)

@@ -89,29 +89,15 @@ t2i_pipeline.guider
 
 ## Changing guider parameters
 
-The guider parameters can be adjusted with either the [`~ComponentSpec.create`] method or with [`~ModularPipeline.update_components`]. The example below changes the `guidance_scale` value.
+The guider parameters can be adjusted with the [`~ComponentSpec.create`] method and [`~ModularPipeline.update_components`]. The example below changes the `guidance_scale` value.
 
-<hfoptions id="switch">
-<hfoption id="create">
 
 ```py
 guider_spec = t2i_pipeline.get_component_spec("guider")
 guider = guider_spec.create(guidance_scale=10)
 t2i_pipeline.update_components(guider=guider)
 ```
 
-</hfoption>
-<hfoption id="update_components">
-
-```py
-guider_spec = t2i_pipeline.get_component_spec("guider")
-guider_spec.config["guidance_scale"] = 10
-t2i_pipeline.update_components(guider=guider_spec)
-```
-
-</hfoption>
-</hfoptions>
-
 ## Uploading custom guiders
 
 Call the [`~utils.PushToHubMixin.push_to_hub`] method on a custom guider to share it to the Hub.
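The spec-based pattern in the diff above (a spec holds default config, `create()` builds a component with per-call overrides) can be sketched in plain Python. This is a hypothetical simplification for illustration, not the diffusers `ComponentSpec` API; the `Guider`/`GuiderSpec` names and the 7.5 default are invented here.

```python
from dataclasses import dataclass, field

@dataclass
class Guider:
    guidance_scale: float

@dataclass
class GuiderSpec:
    # default config; create() merges per-call overrides on top of it
    config: dict = field(default_factory=lambda: {"guidance_scale": 7.5})

    def create(self, **overrides):
        merged = {**self.config, **overrides}
        return Guider(**merged)

spec = GuiderSpec()
guider = spec.create(guidance_scale=10)
print(guider.guidance_scale)  # 10
```

The spec's stored defaults are untouched by `create()`, which is why a fresh `create()` call without overrides still returns the default value.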

docs/source/en/modular_diffusers/modular_diffusers_states.md (1 addition, 3 deletions)

@@ -25,9 +25,7 @@ This guide explains how states work and how they connect blocks.
 
 The [`~modular_pipelines.PipelineState`] is a global state container for all blocks. It maintains the complete runtime state of the pipeline and provides a structured way for blocks to read from and write to shared data.
 
-There are two dict's in [`~modular_pipelines.PipelineState`] for structuring data.
-
-- The `values` dict is a **mutable** state containing a copy of user provided input values and intermediate output values generated by blocks. If a block modifies an `input`, it will be reflected in the `values` dict after calling `set_block_state`.
+[`~modular_pipelines.PipelineState`] stores all data in a `values` dict, which is a **mutable** state containing user provided input values and intermediate output values generated by blocks. If a block modifies an `input`, it will be reflected in the `values` dict after calling `set_block_state`.
 
 ```py
 PipelineState(
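The single-`values`-dict model described in the diff above can be sketched in plain Python. This is a hypothetical, minimal analogy, not the diffusers implementation: all data lives in one mutable dict, a block takes a local snapshot, mutates it, and writes its changes back.

```python
from types import SimpleNamespace

class PipelineState:
    def __init__(self, **values):
        # single mutable store for user inputs and intermediate outputs
        self.values = dict(values)

    def get_block_state(self, names):
        # local view restricted to the names a block declared as inputs
        return SimpleNamespace(**{n: self.values[n] for n in names if n in self.values})

    def set_block_state(self, block_state):
        # push the local changes back into the shared values dict
        self.values.update(vars(block_state))

state = PipelineState(image="raw image")
block_state = state.get_block_state(["image"])
block_state.image = "resized image"            # modifying an input...
block_state.image_latents = "encoded latents"  # ...and adding a new output
state.set_block_state(block_state)
print(state.values)
# {'image': 'resized image', 'image_latents': 'encoded latents'}
```

Note how the modified `image` replaces the original in `values` after `set_block_state`, which is exactly the mutability the new doc text calls out.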

docs/source/en/modular_diffusers/modular_pipeline.md (242 additions, 173 deletions)

Large diffs are not rendered by default.

docs/source/en/modular_diffusers/pipeline_block.md (115 additions, 45 deletions)

@@ -25,81 +25,151 @@ This guide will show you how to create a [`~modular_pipelines.ModularPipelineBlocks`]
 
 A [`~modular_pipelines.ModularPipelineBlocks`] requires `inputs`, and `intermediate_outputs`.
 
-- `inputs` are values provided by a user and retrieved from the [`~modular_pipelines.PipelineState`]. This is useful because some workflows resize an image, but the original image is still required. The [`~modular_pipelines.PipelineState`] maintains the original image.
+- `inputs` are values a block reads from the [`~modular_pipelines.PipelineState`] to perform its computation. These can be values provided by a user (like a prompt or image) or values produced by a previous block (like encoded `image_latents`).
 
   Use `InputParam` to define `inputs`.
 
-  ```py
-  from diffusers.modular_pipelines import InputParam
-
-  user_inputs = [
-      InputParam(name="image", type_hint="PIL.Image", description="raw input image to process")
-  ]
-  ```
+  ```py
+  class ImageEncodeStep(ModularPipelineBlocks):
+      ...
+
+      @property
+      def inputs(self):
+          return [
+              InputParam(name="image", type_hint="PIL.Image", required=True, description="raw input image to process"),
+          ]
+      ...
+  ```
 
 - `intermediate_outputs` are new values created by a block and added to the [`~modular_pipelines.PipelineState`]. The `intermediate_outputs` are available as `inputs` for subsequent blocks or available as the final output from running the pipeline.
 
   Use `OutputParam` to define `intermediate_outputs`.
 
-  ```py
-  from diffusers.modular_pipelines import OutputParam
+  ```py
+  class ImageEncodeStep(ModularPipelineBlocks):
+      ...
 
-  user_intermediate_outputs = [
-      OutputParam(name="image_latents", description="latents representing the image")
-  ]
-  ```
+      @property
+      def intermediate_outputs(self):
+          return [
+              OutputParam(name="image_latents", description="latents representing the image"),
+          ]
+
+      ...
+  ```
 
 The intermediate inputs and outputs share data to connect blocks. They are accessible at any point, allowing you to track the workflow's progress.
 
+## Components and configs
+
+The components and pipeline-level configs a block needs are specified in [`ComponentSpec`] and [`~modular_pipelines.ConfigSpec`].
+
+- [`ComponentSpec`] contains the expected components used by a block. You need the `name` of the component and ideally a `type_hint` that specifies exactly what the component is.
+- [`~modular_pipelines.ConfigSpec`] contains pipeline-level settings that control behavior across all blocks.
+
+```py
+class ImageEncodeStep(ModularPipelineBlocks):
+    ...
+
+    @property
+    def expected_components(self):
+        return [
+            ComponentSpec(name="vae", type_hint=AutoencoderKL),
+        ]
+
+    @property
+    def expected_configs(self):
+        return [
+            ConfigSpec("force_zeros_for_empty_prompt", True),
+        ]
+
+    ...
+```
+
+When the blocks are converted into a pipeline, the components become available to the block as the first argument in `__call__`.
+
 ## Computation logic
 
 The computation a block performs is defined in the `__call__` method and it follows a specific structure.
 
-1. Retrieve the [`~modular_pipelines.BlockState`] to get a local view of the `inputs`
+1. Retrieve the [`~modular_pipelines.BlockState`] to get a local view of the `inputs`.
 2. Implement the computation logic on the `inputs`.
 3. Update [`~modular_pipelines.PipelineState`] to push changes from the local [`~modular_pipelines.BlockState`] back to the global [`~modular_pipelines.PipelineState`].
 4. Return the components and state which becomes available to the next block.
 
 ```py
-def __call__(self, components, state):
-    # Get a local view of the state variables this block needs
-    block_state = self.get_block_state(state)
+class ImageEncodeStep(ModularPipelineBlocks):
+
+    def __call__(self, components, state):
+        # Get a local view of the state variables this block needs
+        block_state = self.get_block_state(state)
 
-    # Your computation logic here
-    # block_state contains all your inputs
-    # Access them like: block_state.image, block_state.processed_image
+        # Your computation logic here
+        # block_state contains all your inputs
+        # Access them like: block_state.image, block_state.processed_image
 
-    # Update the pipeline state with your updated block_states
-    self.set_block_state(state, block_state)
-    return components, state
+        # Update the pipeline state with your updated block_state
+        self.set_block_state(state, block_state)
+        return components, state
 ```
 
-### Components and configs
+## Putting it all together
 
-The components and pipeline-level configs a block needs are specified in [`ComponentSpec`] and [`~modular_pipelines.ConfigSpec`].
+Here is the complete block with all the pieces connected.
 
-- [`ComponentSpec`] contains the expected components used by a block. You need the `name` of the component and ideally a `type_hint` that specifies exactly what the component is.
-- [`~modular_pipelines.ConfigSpec`] contains pipeline-level settings that control behavior across all blocks.
+```py
+from diffusers import ComponentSpec, AutoencoderKL
+from diffusers.modular_pipelines import InputParam, ModularPipelineBlocks, OutputParam
+
+
+class ImageEncodeStep(ModularPipelineBlocks):
+
+    @property
+    def description(self):
+        return "Encode an image into latent space."
+
+    @property
+    def expected_components(self):
+        return [
+            ComponentSpec(name="vae", type_hint=AutoencoderKL),
+        ]
+
+    @property
+    def inputs(self):
+        return [
+            InputParam(name="image", type_hint="PIL.Image", required=True, description="raw input image to process"),
+        ]
+
+    @property
+    def intermediate_outputs(self):
+        return [
+            OutputParam(name="image_latents", type_hint="torch.Tensor", description="latents representing the image"),
+        ]
+
+    def __call__(self, components, state):
+        block_state = self.get_block_state(state)
+        block_state.image_latents = components.vae.encode(block_state.image)
+        self.set_block_state(state, block_state)
+        return components, state
+```
+
+Every block has a `doc` property that is automatically generated from the properties you defined above. It provides a summary of the block's description, components, inputs, and outputs.
 
 ```py
-from diffusers import ComponentSpec, ConfigSpec
+block = ImageEncodeStep()
+print(block.doc)
+class ImageEncodeStep
 
-expected_components = [
-    ComponentSpec(name="unet", type_hint=UNet2DConditionModel),
-    ComponentSpec(name="scheduler", type_hint=EulerDiscreteScheduler)
-]
+  Encode an image into latent space.
 
-expected_config = [
-    ConfigSpec("force_zeros_for_empty_prompt", True)
-]
-```
+  Components:
+      vae (`AutoencoderKL`)
 
-When the blocks are converted into a pipeline, the components become available to the block as the first argument in `__call__`.
+  Inputs:
+      image (`PIL.Image`):
+          raw input image to process
 
-```py
-def __call__(self, components, state):
-    # Access components using dot notation
-    unet = components.unet
-    vae = components.vae
-    scheduler = components.scheduler
-```
+  Outputs:
+      image_latents (`torch.Tensor`):
+          latents representing the image
+```
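The four-step `__call__` structure the new doc describes (local view, compute, write back, return) can be illustrated without diffusers. This is a hypothetical, self-contained sketch with a plain dict standing in for `PipelineState` and a lambda standing in for the VAE component, not the real API.

```python
from types import SimpleNamespace

class EncodeStep:
    inputs = ["image"]
    intermediate_outputs = ["image_latents"]

    def get_block_state(self, state):
        # 1. local view of the declared inputs
        return SimpleNamespace(**{k: state[k] for k in self.inputs})

    def set_block_state(self, state, block_state):
        # 3. push local changes back to the shared state
        state.update(vars(block_state))

    def __call__(self, components, state):
        block_state = self.get_block_state(state)
        # 2. computation logic: "encode" the image with the vae component
        block_state.image_latents = components["vae"](block_state.image)
        self.set_block_state(state, block_state)
        # 4. return components and state for the next block
        return components, state

components = {"vae": lambda img: f"latents({img})"}
state = {"image": "cat.png"}
EncodeStep()(components, state)
print(state["image_latents"])  # latents(cat.png)
```

After the call, `image_latents` lives in the shared state, so a later block could declare it as one of its `inputs`, which is the connection mechanism the doc describes.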

docs/source/en/modular_diffusers/quickstart.md (52 additions, 14 deletions)

@@ -39,17 +39,44 @@ image
 [`~ModularPipeline.from_pretrained`] uses lazy loading - it reads the configuration to learn where to load each component from, but doesn't actually load the model weights until you call [`~ModularPipeline.load_components`]. This gives you control over when and how components are loaded.
 
 > [!TIP]
-> [`ComponentsManager`] with `enable_auto_cpu_offload` automatically moves models between CPU and GPU as needed, reducing memory usage for large models like Qwen-Image. Learn more in the [ComponentsManager](./components_manager) guide.
+> `ComponentsManager` with `enable_auto_cpu_offload` automatically moves models between CPU and GPU as needed, reducing memory usage for large models like Qwen-Image. Learn more in the [ComponentsManager](./components_manager) guide.
+>
+> If you don't need offloading, remove the `components_manager` argument and move the pipeline to your device manually with `to("cuda")`.
 
 Learn more about creating and loading pipelines in the [Creating a pipeline](https://huggingface.co/docs/diffusers/modular_diffusers/modular_pipeline#creating-a-pipeline) and [Loading components](https://huggingface.co/docs/diffusers/modular_diffusers/modular_pipeline#loading-components) guides.
 
 ## Understand the structure
 
-A [`ModularPipeline`] has two parts:
-- **State**: the loaded components (models, schedulers, processors) and configuration
-- **Definition**: the [`ModularPipelineBlocks`] that specify inputs, outputs, expected components and computation logic
+A [`ModularPipeline`] has two parts: a **definition** (the blocks) and a **state** (the loaded components and configs).
 
-The blocks define *what* the pipeline does. Access them through `pipe.blocks`.
+Print the pipeline to see its state — the components and their loading status and configuration.
+```py
+print(pipe)
+```
+```
+QwenImageModularPipeline {
+  "_blocks_class_name": "QwenImageAutoBlocks",
+  "_class_name": "QwenImageModularPipeline",
+  "_diffusers_version": "0.37.0.dev0",
+  "transformer": [
+    "diffusers",
+    "QwenImageTransformer2DModel",
+    {
+      "pretrained_model_name_or_path": "Qwen/Qwen-Image",
+      "revision": null,
+      "subfolder": "transformer",
+      "type_hint": [
+        "diffusers",
+        "QwenImageTransformer2DModel"
+      ],
+      "variant": null
+    }
+  ],
+  ...
+}
+```
+
+Access the definition through `pipe.blocks` — this is the [`~modular_pipelines.ModularPipelineBlocks`] that defines the pipeline's workflows, inputs, outputs, and computation logic.
 ```py
 print(pipe.blocks)
 ```
@@ -87,7 +114,8 @@ The output returns:
 
 ### Workflows
 
-`QwenImageAutoBlocks` is a [`ConditionalPipelineBlocks`], so this pipeline supports multiple workflows and adapts its behavior based on the inputs you provide. For example, if you pass `image` to the pipeline, it runs an image-to-image workflow instead of text-to-image. Let's see this in action with an example.
+This pipeline supports multiple workflows and adapts its behavior based on the inputs you provide. For example, if you pass `image` to the pipeline, it runs an image-to-image workflow instead of text-to-image. Learn more about how this works under the hood in the [AutoPipelineBlocks](https://huggingface.co/docs/diffusers/modular_diffusers/auto_pipeline_blocks) guide.
+
 ```py
 from diffusers.utils import load_image
 
@@ -99,20 +127,21 @@ image = pipe(
 ).images[0]
 ```
 
-Use `get_workflow()` to extract the blocks for a specific workflow. Pass the workflow name (e.g., `"image2image"`, `"inpainting"`, `"controlnet_text2image"`) to get only the blocks relevant to that workflow.
+Use `get_workflow()` to extract the blocks for a specific workflow. Pass the workflow name (e.g., `"image2image"`, `"inpainting"`, `"controlnet_text2image"`) to get only the blocks relevant to that workflow. This is useful when you want to customize or debug a specific workflow. You can check `pipe.blocks.available_workflows` to see all available workflows.
 ```py
 img2img_blocks = pipe.blocks.get_workflow("image2image")
 ```
 
-Conditional blocks are convenient for users, but their conditional logic adds complexity when customizing or debugging. Extracting a workflow gives you the specific blocks relevant to your workflow, making it easier to work with. Learn more in the [AutoPipelineBlocks](https://huggingface.co/docs/diffusers/modular_diffusers/auto_pipeline_blocks) guide.
 
 ### Sub-blocks
 
 Blocks can contain other blocks. `pipe.blocks` gives you the top-level block definition (here, `QwenImageAutoBlocks`), while `sub_blocks` lets you access the smaller blocks inside it.
 
-`QwenImageAutoBlocks` is composed of: `text_encoder`, `vae_encoder`, `controlnet_vae_encoder`, `denoise`, and `decode`. Access them through the `sub_blocks` property.
+`QwenImageAutoBlocks` is composed of: `text_encoder`, `vae_encoder`, `controlnet_vae_encoder`, `denoise`, and `decode`.
 
-The `doc` property is useful for seeing the full documentation of any block, including its inputs, outputs, and components.
+These sub-blocks run one after another and data flows linearly from one block to the next — each block's `intermediate_outputs` become available as `inputs` to the next block. This is how [`SequentialPipelineBlocks`](./sequential_pipeline_blocks) work.
+
+You can access them through the `sub_blocks` property. The `doc` property is useful for seeing the full documentation of any block, including its inputs, outputs, and components.
```py
 vae_encoder_block = pipe.blocks.sub_blocks["vae_encoder"]
 print(vae_encoder_block.doc)
@@ -165,7 +194,7 @@ class CannyBlock
         Canny map for input image
 ```
 
-UUse `get_workflow` to extract the ControlNet workflow from [`QwenImageAutoBlocks`].
+Use `get_workflow` to extract the ControlNet workflow from [`QwenImageAutoBlocks`].
 ```py
 # Get the controlnet workflow that we want to work with
 blocks = pipe.blocks.get_workflow("controlnet_text2image")
@@ -182,9 +211,8 @@ class SequentialPipelineBlocks
   ...
 ```
 
-The extracted workflow is a [`SequentialPipelineBlocks`](./sequential_pipeline_blocks) - a multi-block type where blocks run one after another and data flows linearly from one block to the next. Each block's `intermediate_outputs` become available as `inputs` to subsequent blocks.
 
-Currently this workflow requires `control_image` as input. Let's insert the canny block at the beginning so the pipeline accepts a regular image instead.
+The extracted workflow is a [`SequentialPipelineBlocks`](./sequential_pipeline_blocks) and it currently requires `control_image` as input. Insert the canny block at the beginning so the pipeline accepts a regular image instead.
 ```py
 # Insert canny at the beginning
 blocks.sub_blocks.insert("canny", canny_block, 0)
@@ -211,7 +239,7 @@ class SequentialPipelineBlocks
 
 Now the pipeline takes `image` as input instead of `control_image`. Because blocks in a sequence share data automatically, the canny block's output (`control_image`) flows to the denoise block that needs it, and the canny block's input (`image`) becomes a pipeline input since no earlier block provides it.
 
-Create a pipeline from the modified blocks and load a ControlNet model.
+Create a pipeline from the modified blocks and load a ControlNet model. The ControlNet isn't part of the original model repository, so load it separately and add it with [`~ModularPipeline.update_components`].
 ```py
 pipeline = blocks.init_pipeline("Qwen/Qwen-Image", components_manager=manager)
 
@@ -241,6 +269,16 @@ output
 ## Next steps
 
 <hfoptions id="next">
+<hfoption id="Learn the basics">
+
+Understand the core building blocks of Modular Diffusers:
+
+- [ModularPipelineBlocks](./pipeline_block): The basic unit for defining a step in a pipeline.
+- [SequentialPipelineBlocks](./sequential_pipeline_blocks): Chain blocks to run in sequence.
+- [AutoPipelineBlocks](./auto_pipeline_blocks): Create pipelines that support multiple workflows.
+- [States](./modular_diffusers_states): How data is shared between blocks.
+
+</hfoption>
 <hfoption id="Build custom blocks">
 
 Learn how to create your own blocks with custom logic in the [Building Custom Blocks](./custom_blocks) guide.
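The sequential data flow and block insertion described in the quickstart diff above can be illustrated with plain functions and a dict, a hypothetical sketch rather than the diffusers API. Blocks run in order, each block's outputs land in shared state and become inputs for later blocks, and inserting a canny step first means the pipeline needs `image` instead of `control_image`.

```python
def canny(state):
    # produces control_image from image, like the canny block in the guide
    state["control_image"] = f"canny({state['image']})"

def denoise(state):
    # consumes control_image produced by the previous block
    state["output"] = f"denoised({state['control_image']})"

blocks = [denoise]
blocks.insert(0, canny)  # analogous to blocks.sub_blocks.insert("canny", canny_block, 0)

state = {"image": "cat.png"}
for block in blocks:
    block(state)
print(state["output"])  # denoised(canny(cat.png))
```

Because `canny` runs first and writes `control_image` into the shared state, no user has to supply it; the only input no block provides, `image`, is what the caller must pass in.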

0 commit comments
