Fix: Use attn_implementation='eager' for MPS compatibility #78
Description:
When running the colqwen2 model on an MPS device (Apple Silicon), the default attention implementation causes compatibility issues.
The fix sets attn_implementation="eager" whenever the device is MPS, allowing stable execution.
Steps to Reproduce:
1. Run the model on a MacBook Pro with an MPS device:

```python
# Initialize RAGMultiModalModel
model = RAGMultiModalModel.from_pretrained(
    "vidore/colqwen2-v0.1",
    device="mps",
)
```
2. Observe that the default attention implementation raises an error:

```
IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 3)
```
Proposed Fix:
Update the from_pretrained method to set attn_implementation="eager" when running on MPS:

```python
attn_implementation = (
    "eager"
    if device == "mps" or (isinstance(device, torch.device) and device.type == "mps")
    else None
)
```
File: byaldi/colpali.py
Location: Inside the from_pretrained method.
Rationale:
• MPS devices currently face issues with the default attention implementation.
• "eager" mode provides a stable alternative.
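The device check in the proposed fix can be sketched as a small standalone helper (the name `pick_attn_implementation` is hypothetical, introduced here only for illustration). It accepts either a plain device string or a `torch.device`-like object with a `.type` attribute, mirroring the conditional above:

```python
def pick_attn_implementation(device):
    """Return "eager" on MPS to work around the IndexError above, else None.

    Accepts either a plain string ("mps", "cuda", ...) or an object with a
    .type attribute such as torch.device. Returning None leaves the library's
    default attention implementation in place on non-MPS devices.
    """
    # torch.device exposes the backend name via .type; plain strings are used as-is
    device_type = getattr(device, "type", device)
    return "eager" if device_type == "mps" else None
```

The returned value would then be forwarded to the underlying `from_pretrained` call, e.g. `attn_implementation=pick_attn_implementation(device)`, so non-MPS users keep the default behavior.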