Add a format provider for VLLM models. #3601
Open
Some vLLM models, e.g. Gemma 3 12B, expect a strict ordering on message roles, i.e. these should be `system, user, assistant, user, assistant`, etc. I haven't been able to get this to work with any of the existing `provider_fmt` options; `mistralai` ensures that the last message is a `user` message but doesn't compact the existing `user` or `assistant` messages.

This PR adds a new format provider option, `vllm`, which ensures that the roles are presented in the proper ordering. Consecutive `user` messages are compacted together. This was tested to work properly with vLLM 0.9.1 and various models (Gemma 3, Mistral). Not sure if this is the right way to solve this issue, so let me know if there is a better alternative.
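For reference, the compaction idea can be sketched roughly like this (a minimal illustration of merging consecutive same-role messages, not the actual diff; the function name and message-dict shape are assumptions):

```python
def compact_messages(messages):
    """Merge consecutive messages that share a role so the sequence
    alternates user/assistant (with an optional leading system message),
    which strict-template models like Gemma 3 require."""
    compacted = []
    for msg in messages:
        if compacted and compacted[-1]["role"] == msg["role"]:
            # Same role as the previous message: fold the content in
            # rather than appending a second message with that role.
            compacted[-1]["content"] += "\n" + msg["content"]
        else:
            # Role changed: start a new message (copy to avoid mutating input).
            compacted.append(dict(msg))
    return compacted


msgs = [
    {"role": "system", "content": "Be terse."},
    {"role": "user", "content": "Hi"},
    {"role": "user", "content": "Hello?"},
    {"role": "assistant", "content": "Hey"},
]
# Roles after compaction: system, user, assistant
print([m["role"] for m in compact_messages(msgs)])
```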