Fix AttentionModuleMixin.set_attention_backend skipping hub kernel download #13783

Open
adityasingh2400 wants to merge 1 commit into huggingface:main from adityasingh2400:fix-set-attention-backend-kernel-download

Conversation

@adityasingh2400

What does this PR do?

Fixes #13284.

AttentionModuleMixin.set_attention_backend (src/diffusers/models/attention.py) validated the backend name and set self.processor._attention_backend, but skipped the _check_attention_backend_requirements(...) and _maybe_download_kernel_for_backend(...) calls that ModelMixin.set_attention_backend (src/diffusers/models/modeling_utils.py) performs. As a result, applying a hub backend such as sage_hub to individual attention submodules only (rather than to the whole model) left the kernel registry uninitialized without any warning, and inference crashed later with TypeError: 'NoneType' object is not callable inside dispatch_attention_fn.
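To make the failure mode concrete, here is a toy model of it. This is only a sketch: HUB_KERNELS, download_kernel, and dispatch are hypothetical stand-ins invented for illustration, not the actual diffusers registry or dispatch code. It assumes the registry entry for a hub backend holds no callable until the download step has run.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry (not the real diffusers structure): a hub backend
# has no kernel function until something downloads it.
HUB_KERNELS: Dict[str, Optional[Callable]] = {"sage_hub": None}


def download_kernel(backend: str) -> None:
    """Stand-in for the kernel download step: fills the registry entry."""
    HUB_KERNELS[backend] = lambda q, k, v: ("attention-output", q, k, v)


def dispatch(backend: str, q, k, v):
    """Stand-in for dispatch: look up the kernel and call it."""
    kernel = HUB_KERNELS[backend]
    return kernel(q, k, v)  # raises TypeError while the entry is still None


def run_without_download() -> str:
    """Dispatch before any download and report the resulting error text."""
    try:
        dispatch("sage_hub", 1, 2, 3)
    except TypeError as exc:
        return str(exc)
    return "no error"


error_message = run_without_download()
```

In this toy version, calling download_kernel("sage_hub") before dispatch is what makes the call succeed, which is the step the submodule setter was missing.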

The fix adds the same requirement-check and kernel-download steps to the submodule setter, so per-block backend overrides also work for hub kernels.
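The shape of the change can be sketched as follows. This is not the actual diff: ToyAttention, KNOWN_BACKENDS, check_requirements, and maybe_download_kernel are hypothetical names standing in for the mixin, its backend validation, and the two helpers named above, and the ordering of the steps is an assumption.

```python
from types import SimpleNamespace
from typing import List, Tuple

# Records which helper ran with which backend, so the order is observable.
calls: List[Tuple[str, str]] = []

# Hypothetical set of valid backend names (the real list lives in diffusers).
KNOWN_BACKENDS = {"native", "sage_hub"}


def check_requirements(backend: str) -> None:
    """Stand-in for _check_attention_backend_requirements."""
    calls.append(("check", backend))


def maybe_download_kernel(backend: str) -> None:
    """Stand-in for _maybe_download_kernel_for_backend."""
    calls.append(("download", backend))


class ToyAttention:
    """Minimal stand-in for a module using AttentionModuleMixin."""

    def __init__(self) -> None:
        self.processor = SimpleNamespace(_attention_backend=None)

    def set_attention_backend(self, backend: str) -> None:
        name = backend.lower()
        if name not in KNOWN_BACKENDS:
            raise ValueError(f"unknown attention backend: {backend!r}")
        # The two steps the submodule setter previously skipped
        # (order assumed here: check first, then download).
        check_requirements(name)
        maybe_download_kernel(name)
        self.processor._attention_backend = name
```

Doing the check and download before assigning the backend means a failed requirement check leaves the processor on its previous backend, at least in this sketch.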

Reproduction

The repro from the issue (Wan-AI/Wan2.1-T2V-1.3B-Diffusers + per-submodule sage_hub) now runs through inference without the NoneType crash, and the underlying _HUB_KERNELS_REGISTRY entry for sage_hub is populated as expected.

Tests

Added a regression test in tests/models/test_attention_processor.py::AttentionModuleMixinSetBackendTests that:

  • calls set_attention_backend("sage_hub") on a minimal AttentionModuleMixin instance and asserts both _check_attention_backend_requirements and _maybe_download_kernel_for_backend are invoked with the resolved backend (using unittest.mock.patch so no kernel download or GPU is required), and that processor._attention_backend is updated;
  • asserts that passing an unknown backend name still raises ValueError.

Both tests pass on the fix and fail on main (verified by temporarily reverting the source change).
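For readers who want to see the mocking pattern without opening the diff, the two tests described above can be sketched roughly like this. It is an illustration only: ToyAttention, helpers, and KNOWN_BACKENDS are hypothetical stand-ins, and the real test patches the diffusers helpers rather than a SimpleNamespace.

```python
import unittest
from types import SimpleNamespace
from unittest.mock import patch


def _check_requirements(backend: str) -> None:
    """Hypothetical stand-in for the requirement check."""


def _maybe_download_kernel(backend: str) -> None:
    """Hypothetical stand-in for the download; must never run in tests."""
    raise RuntimeError("a real download would need network access")


# Namespace standing in for the module that owns the two helpers.
helpers = SimpleNamespace(
    check_requirements=_check_requirements,
    maybe_download_kernel=_maybe_download_kernel,
)

KNOWN_BACKENDS = {"native", "sage_hub"}  # hypothetical backend names


class ToyAttention:
    def __init__(self) -> None:
        self.processor = SimpleNamespace(_attention_backend=None)

    def set_attention_backend(self, backend: str) -> None:
        name = backend.lower()
        if name not in KNOWN_BACKENDS:
            raise ValueError(f"unknown attention backend: {backend!r}")
        helpers.check_requirements(name)
        helpers.maybe_download_kernel(name)
        self.processor._attention_backend = name


class SetBackendTests(unittest.TestCase):
    def test_hub_backend_triggers_check_and_download(self) -> None:
        module = ToyAttention()
        # Patch both helpers so no kernel download or GPU is needed.
        with patch.object(helpers, "check_requirements") as check, \
                patch.object(helpers, "maybe_download_kernel") as download:
            module.set_attention_backend("sage_hub")
        check.assert_called_once_with("sage_hub")
        download.assert_called_once_with("sage_hub")
        self.assertEqual(module.processor._attention_backend, "sage_hub")

    def test_unknown_backend_raises(self) -> None:
        with self.assertRaises(ValueError):
            ToyAttention().set_attention_backend("not_a_backend")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SetBackendTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the unpatched download stand-in raises, the first test would error out if the patch were not in effect, which guards against an accidental real download.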

Before submitting

  • This PR fixes a typo or improves the docs (no quality tests required)? No, bug fix.
  • Did you read the contributor guideline?
  • Did you write any new necessary tests? Yes.

Who can review?

@yiyixuxu @DN6 @sayakpaul

…wnload

The per-submodule set_attention_backend on AttentionModuleMixin only
validated the backend name and updated self.processor._attention_backend,
but skipped the _check_attention_backend_requirements and
_maybe_download_kernel_for_backend calls that ModelMixin.set_attention_backend
performs. As a result, hub-based backends like sage_hub were silently set
without the kernel ever being downloaded, and inference failed later with
TypeError: 'NoneType' object is not callable inside dispatch_attention_fn.

This mirrors the requirement-check and kernel-download path from ModelMixin
into the submodule-level setter so per-block backend overrides work for
hub kernels.

Fixes huggingface#13284
github-actions bot added the fixes-issue, size/S (PR with diff < 50 LOC), models, and tests labels and removed the size/S (PR with diff < 50 LOC) label on May 21, 2026
Development

Successfully merging this pull request may close these issues.

AttentionModuleMixin.set_attention_backend does not download hub kernels
