murshidm

Description:
When running the colqwen2 model on an MPS device (Apple Silicon), the default attention implementation fails at inference time with an IndexError (see the error below).
The fix ensures that attn_implementation="eager" is used when the device is MPS, allowing stable execution.

Steps to Reproduce:
1. Run the model on a MacBook Pro with an MPS device:

   from byaldi import RAGMultiModalModel

   # Initialize RAGMultiModalModel
   model = RAGMultiModalModel.from_pretrained(
       "vidore/colqwen2-v0.1",
       device="mps",
   )
2. Observe that the default attention implementation may raise the error below.
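
In practice the IndexError tends to surface only once the model runs a forward pass. A hypothetical way to trigger it after the initialization above, assuming byaldi's documented index/search API (the document path and index name here are placeholders):

    # Encoding documents/queries forces a forward pass through the attention layers,
    # which is where the failure on MPS shows up.
    model.index(
        input_path="docs/sample.pdf",   # placeholder document
        index_name="mps_repro",         # placeholder index name
        store_collection_with_index=False,
        overwrite=True,
    )
    results = model.search("What does this document describe?", k=3)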

Error:
IndexError: Dimension out of range (expected to be in range of [-3, 2], but got 3)

Proposed Fix:
Update the from_pretrained method to set attn_implementation="eager" when running on MPS.
# Force eager attention on MPS; otherwise leave the default implementation.
attn_implementation = (
    "eager"
    if device == "mps"
    or (isinstance(device, torch.device) and device.type == "mps")
    else None
)
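
A minimal sketch of how this check could be wired into the loading code, assuming the underlying model is created via colpali_engine's ColQwen2 and that its from_pretrained forwards attn_implementation to transformers. byaldi's actual from_pretrained differs in detail, and the helper name below is made up:

    import torch
    from colpali_engine.models import ColQwen2, ColQwen2Processor  # assumed import path

    def load_colqwen2(model_name: str, device="mps"):
        # Hypothetical helper mirroring the relevant part of byaldi's from_pretrained.
        attn_implementation = (
            "eager"
            if device == "mps"
            or (isinstance(device, torch.device) and device.type == "mps")
            else None
        )
        model = ColQwen2.from_pretrained(
            model_name,
            attn_implementation=attn_implementation,  # None lets transformers pick its default
        ).to(device).eval()
        processor = ColQwen2Processor.from_pretrained(model_name)
        return model, processor

    model, processor = load_colqwen2("vidore/colqwen2-v0.1", device="mps")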

File: byaldi/colpali.py

Location: Inside the from_pretrained method.

Rationale:
• On MPS devices, the default attention implementation currently fails with the IndexError shown above.
• "eager" mode provides a stable alternative; the check applies it only on MPS, leaving other devices on the default implementation.
