Fix mask handling for batch generation in HunyuanVideo example #546

tjkemp · 2025-07-31T08:26:59Z

Summary

Batch inference in hunyuan_video_usp_example.py used only the first item's mask (encoder_attention_mask[0]). That works for a batch size of 1, but with several prompts it cuts off the masks of longer prompts and produces artifacts. This PR fixes the mask handling.
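To make the failure mode concrete, here is a minimal, framework-free sketch. It is not the diff from this PR: plain Python lists stand in for the tensors in the example script, and the helper names `select_by_first_mask` and `select_by_batch_mask` are hypothetical. It only illustrates why indexing with the first prompt's mask drops tokens of a longer second prompt, and shows one possible batch-aware alternative.

```python
# Illustrative sketch only: lists stand in for tensors, helper names are hypothetical.

def select_by_first_mask(tokens, masks):
    """Problematic pattern: keep only positions valid for the FIRST prompt."""
    keep = [i for i, m in enumerate(masks[0]) if m]
    return [[row[i] for i in keep] for row in tokens]


def select_by_batch_mask(tokens, masks):
    """One batch-aware option: keep positions valid for ANY prompt in the batch."""
    keep = [i for i, column in enumerate(zip(*masks)) if any(column)]
    return [[row[i] for i in keep] for row in tokens]


# Two prompts padded to the same length; the second prompt is longer.
tokens = [
    ["a", "husky", "puppy", "<pad>", "<pad>", "<pad>"],
    ["two", "siamese", "cats", "eat", "sushi", "<pad>"],
]
masks = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
]

truncated = select_by_first_mask(tokens, masks)
batch_aware = select_by_batch_mask(tokens, masks)

print(truncated[1])    # the second prompt loses "eat" and "sushi"
print(batch_aware[1])  # the second prompt keeps all five real tokens
```

With the batch-aware selection, the shorter prompt still carries padding positions, so a per-prompt mask would still be needed downstream to ignore them.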

How to reproduce

Apply this patch so the script saves every video in the batch:

diff --git a/examples/hunyuan_video_usp_example.py b/examples/hunyuan_video_usp_example.py
index 03856f1..a9041ac 100644
--- a/examples/hunyuan_video_usp_example.py
+++ b/examples/hunyuan_video_usp_example.py
@@ -297,7 +291,7 @@ def main():
         guidance_scale=input_config.guidance_scale,
         generator=torch.Generator(device="cuda").manual_seed(
             input_config.seed),
-    ).frames[0]
+    )
 
     end_time = time.time()
     elapsed_time = end_time - start_time
@@ -311,9 +305,10 @@ def main():
     )
     if is_dp_last_group():
         resolution = f"{input_config.width}x{input_config.height}"
-        output_filename = f"results/hunyuan_video_{parallel_info}_{resolution}.mp4"
-        export_to_video(output, output_filename, fps=15)
-        print(f"output saved to {output_filename}")
+        for idx, frames in enumerate(output.frames, start=1):
+            output_filename = f"results/hunyuan_video_{idx:02d}_{parallel_info}_{resolution}.mp4"
+            export_to_video(frames, output_filename, fps=15)
+            print(f"output saved to {output_filename}")
 
     if get_world_group().rank == get_world_group().world_size - 1:
         print(

Run an example with an added second prompt.

mkdir -p results && torchrun --nproc_per_node=2 examples/hunyuan_video_usp_example.py --model tencent/HunyuanVideo --ulysses_degree 2 --num_inference_steps 30 --warmup_steps 0 --prompt "A husky puppy plays with its own tail." "Two Siamese cats eat sushi from a plate." --height 320 --width 512 --num_frames 61 --enable_tiling --enable_model_cpu_offload

Compare results

The second video before this fix:

The second video after applying this fix:

Notes

Unrelated to this change, I’m seeing:

[rank1]:   File "/app/xDiT/examples/hunyuan_video_usp_example.py", line 297, in main
[rank1]:     guidance_scale=input_config.guidance_scale,
[rank1]:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: AttributeError: 'InputConfig' object has no attribute 'guidance_scale'
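For anyone who hits the same traceback, a defensive read of the attribute is one possible local workaround. This is only a sketch under assumptions: `read_guidance_scale` is a hypothetical helper, `SimpleNamespace` stands in for the real `InputConfig`, and the fallback value is a placeholder rather than a value taken from the library.

```python
from types import SimpleNamespace


def read_guidance_scale(input_config, default):
    """Return input_config.guidance_scale if present, otherwise the given default."""
    return getattr(input_config, "guidance_scale", default)


# Stand-ins for a config object without and with the attribute (assumption:
# the real InputConfig may or may not define guidance_scale depending on version).
config_without = SimpleNamespace(seed=42)
config_with = SimpleNamespace(seed=42, guidance_scale=7.5)

PLACEHOLDER_DEFAULT = 6.0  # hypothetical fallback, adjust to your setup

print(read_guidance_scale(config_without, PLACEHOLDER_DEFAULT))
print(read_guidance_scale(config_with, PLACEHOLDER_DEFAULT))
```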

feifeibear · 2025-08-04T02:32:47Z

The guidance_scale error probably comes from a compatibility issue with the diffusers version.

tjkemp · 2025-09-02T13:16:34Z

Is there anything else I could adjust to help get this PR ready for merge?

Fix attention mask handling in batch generation

083e198

tjkemp changed the title ~~Fix mask handling for batch generation~~ Fix mask handling for batch generation in HunyuanVideo example Jul 31, 2025

Fix hard coded batch_size

a444773

eppaneamd mentioned this pull request Sep 3, 2025

Fix compile in HunyuanVideo example #556

Merged
