how to set tensor_parallel_size for vllm backend #8055
Comments
If this is an LM arg, you can set it in dspy.LM(), or you can configure it in the model you launch with vLLM before querying.
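For context, here is a sketch of pointing dspy.LM at a locally served vLLM model. The endpoint URL and model name are placeholders, not taken from this thread; note that tensor_parallel_size is not set here, because it belongs to the server launch, not the client.

```python
import dspy

# Hypothetical config sketch: connect DSPy to a vLLM server that exposes
# an OpenAI-compatible API. The api_base and model name are placeholders.
lm = dspy.LM(
    "openai/facebook/opt-13b",            # provider/model as served by vLLM
    api_base="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="EMPTY",                      # vLLM ignores the key by default
)
dspy.configure(lm=lm)
```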
@Jasonsey Is that a flag to pass when launching vLLM, or sent from the client in each request?
Here is how we can use vLLM with the tensor_parallel_size param:

```python
from vllm import LLM

llm = LLM("facebook/opt-13b", tensor_parallel_size=4)
output = llm.generate("San Francisco is a")
```
Yes. Please pass this when launching the vLLM server. Not related to DSPy.
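A sketch of what that server launch could look like, assuming the opt-13b model from the earlier comment and a 4-GPU setup (both are illustrative, not from this thread):

```shell
# Launch an OpenAI-compatible vLLM server with tensor parallelism
# across 4 GPUs. Model name and GPU count are placeholders.
vllm serve facebook/opt-13b --tensor-parallel-size 4

# On older vLLM versions, the equivalent module form is:
# python -m vllm.entrypoints.openai.api_server \
#     --model facebook/opt-13b --tensor-parallel-size 4
```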
What happened?
My code is here.
My question is: how do I set tensor_parallel_size for the vLLM backend? This code is not working for this param.
Steps to reproduce
DSPy version
2.6.17