@DrishtiShrrrma DrishtiShrrrma commented Oct 28, 2025

Refs DrishtiShrrrma#2

Summary

This PR updates the Transformers Version Recommendation so that LLaVA-Next models don’t default to “latest”. We pin to tested, working versions, which prevents a reproducible crash where the image processor receives the literal <image> prompt placeholder instead of an image.

What changed

  • Replace the bullet that says “transformers==latest for LLaVA-Next” with:
  • Use transformers==4.48.0 (recommended) or transformers==4.46.0 for the LLaVA-Next series (e.g., llava-hf/llava-v1.6-vicuna-7b-hf).
  • Keep “transformers==latest” for the other model families listed in the README.

Why
Newer transformers builds change image handling and, in our testing, cause LLaVA-Next evaluation to pass the literal <image> prompt placeholder to the processor instead of the image, leading to:

ValueError: Incorrect image source. Must be a valid URL starting with `http://` or `https://`, a valid path to an image file, or a base64 encoded string. Got USER: <image>
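The failure mode can be illustrated in isolation: the loader accepts only the three source forms the error message lists, and the raw chat template matches none of them. This is a minimal sketch, not the transformers API; `is_valid_image_source` is a hypothetical stand-in for the library's internal check:

```python
import base64
import os


def is_valid_image_source(src: str) -> bool:
    """Accept only what the error message lists: a URL, an existing
    file path, or base64-encoded data. The literal chat template
    string ("USER: <image>") is none of these, hence the ValueError."""
    if src.startswith(("http://", "https://")):
        return True
    if os.path.isfile(src):  # path must actually exist on disk
        return True
    try:
        base64.b64decode(src, validate=True)
        return True
    except Exception:  # spaces, ':' and '<' are invalid base64 chars
        return False


# The prompt text itself reaching this check is the bug on affected versions:
# is_valid_image_source("USER: <image>") is False, so the loader raises.
```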

Observations (tested)

| transformers | Result | Error / Notes |
| --- | --- | --- |
| 4.46.0 | ✅ Works | Stable across tested benchmarks |
| 4.48.0 | ✅ Works | Stable across tested benchmarks |
| 4.57.1 | ❌ Fails | ValueError: Incorrect image source ... Got USER: <image> |
| 5.0.0.dev0 | ❌ Fails | Same ValueError as 4.57.1 |
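The tested matrix above could be encoded as a pre-flight guard. This is an illustrative sketch, not part of the repo; `KNOWN_GOOD`, the bad-prefix list, and `check_transformers_pin` are assumptions, and the exact version where the regression starts (somewhere between 4.48.0 and 4.57.1) has not been bisected:

```python
# Pins taken from the observations table; everything between the
# last known-good and first known-bad release is simply "untested".
KNOWN_GOOD = {"4.46.0", "4.48.0"}
KNOWN_BAD_PREFIXES = ("4.57.", "5.0.0")


def check_transformers_pin(ver: str) -> str:
    """Classify an installed transformers version against the tested matrix."""
    if ver in KNOWN_GOOD:
        return "works"
    if ver.startswith(KNOWN_BAD_PREFIXES):
        return "fails"
    return "untested"
```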

Benchmarks tested

CountBenchQA, MMBench_DEV_EN, MME, SEEDBench_IMG

Model tested

llava-hf/llava-v1.6-vicuna-7b-hf (key: llava_next_vicuna_7b)

Minimal repro

# Failing case (example)
pip install "transformers==4.57.1"
python run.py --data CountBenchQA --model llava_next_vicuna_7b --verbose
# -> ValueError: Incorrect image source ... Got USER: <image>

# Working case (example)
pip install "transformers==4.48.0"
python run.py --data CountBenchQA --model llava_next_vicuna_7b --verbose
# -> runs successfully
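To keep the working setup reproducible across fresh Colab runtimes, one option is recording the pin in a requirements file. This is a sketch; the requirements.txt path is an assumption, not something the repo prescribes:

```shell
# Illustrative: persist the tested pin so a fresh environment
# installs the known-good version instead of "latest".
echo 'transformers==4.48.0' >> requirements.txt
grep transformers requirements.txt
```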

Environment

  • Platform: Google Colab (Python 3.12.12)
  • PyTorch: 2.8.0+cu126 | CUDA: 12.6
  • GPU: NVIDIA L4

docs(readme): pin transformers==4.48.0 (or 4.46.0) for LLaVA-Next