[Bug] Ollama requests fail when including an Image #8067

Open
kmeehl opened this issue Apr 13, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@kmeehl

kmeehl commented Apr 13, 2025

What happened?

Hi! Very interesting project.

I'm trying to use dspy for image classification. As a first step, I'd like to generate a description of an image using the minicpm-v model on Ollama.

The following example results in an error:

import dspy

class Describe(dspy.Signature):
    """Describe the image in detail. Respond only in English."""

    image: dspy.Image = dspy.InputField(desc="A photo")
    description: str = dspy.OutputField(desc="Detailed description of the image.")

image_path = "/tmp/9221487.jpg"
minicpm = dspy.LM('ollama/minicpm-v:latest', api_base='http://localhost:11434', api_key='')

p = dspy.Predict(Describe)
p.set_lm(minicpm)
result = p(image=dspy.Image.from_url(image_path))
print(result.description)

Output:

2025/04/13 13:22:40 WARNING dspy.adapters.json_adapter: Failed to use structured output format. Falling back to JSON mode. Error: litellm.BadRequestError: Invalid Message passed in {'role': 'system', 'content': 'Your input fields are:\n1. `image` (Image): A photo\nYour output fields are:\n1. `description` (str): Detailed description of the image.\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\nInputs will have the following structure:\n\n[[ ## image ## ]]\n{image}\n\nOutputs will be a JSON object with the following fields.\n\n[[ ## description ## ]]\n{description}\nIn adhering to this structure, your objective is: \n        Describe the image in detail. Respond only in English.'}
Traceback (most recent call last):
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/chat_adapter.py", line 41, in __call__
    return super().__call__(lm, lm_kwargs, signature, demos, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/base.py", line 33, in __call__
    outputs = lm(messages=inputs, **lm_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/utils/callback.py", line 266, in wrapper
    return fn(instance, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/base_lm.py", line 52, in __call__
    response = self.forward(prompt=prompt, messages=messages, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/utils/callback.py", line 266, in wrapper
    return fn(instance, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 112, in forward
    results = completion(
              ^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 268, in wrapper
    output = func_cached(key, request, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/cachetools/_decorators.py", line 94, in wrapper
    v = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 257, in func_cached
    return func(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 282, in cached_litellm_completion
    return litellm_completion(
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 301, in litellm_completion
    return litellm.completion(
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/utils.py", line 1213, in wrapper
    raise e
  File "~/.local/lib/python3.12/site-packages/litellm/utils.py", line 1091, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/main.py", line 3093, in completion
    raise exception_type(
  File "~/.local/lib/python3.12/site-packages/litellm/main.py", line 2815, in completion
    response = base_llm_http_handler.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 239, in completion
    data = provider_config.transform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/llms/ollama/completion/transformation.py", line 315, in transform_request
    modified_prompt = ollama_pt(model=model, messages=messages)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/litellm_core_utils/prompt_templates/factory.py", line 265, in ollama_pt
    raise litellm.BadRequestError(
litellm.exceptions.BadRequestError: litellm.BadRequestError: Invalid Message passed in {'role': 'system', 'content': 'Your input fields are:\n1. `image` (Image): A photo\nYour output fields are:\n1. `description` (str): Detailed description of the image.\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\n[[ ## image ## ]]\n{image}\n\n[[ ## description ## ]]\n{description}\n\n[[ ## completed ## ]]\nIn adhering to this structure, your objective is: \n        Describe the image in detail. Respond only in English.'}


...

Traceback (most recent call last):
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/json_adapter.py", line 67, in __call__
    return super().__call__(lm, lm_kwargs, signature, demos, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/chat_adapter.py", line 49, in __call__
    raise e
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/chat_adapter.py", line 41, in __call__
    return super().__call__(lm, lm_kwargs, signature, demos, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/base.py", line 33, in __call__
    outputs = lm(messages=inputs, **lm_kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/utils/callback.py", line 266, in wrapper
    return fn(instance, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/base_lm.py", line 52, in __call__
    response = self.forward(prompt=prompt, messages=messages, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/utils/callback.py", line 266, in wrapper
    return fn(instance, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 112, in forward
    results = completion(
              ^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 268, in wrapper
    output = func_cached(key, request, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/cachetools/_decorators.py", line 94, in wrapper
    v = func(*args, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 257, in func_cached
    return func(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 282, in cached_litellm_completion
    return litellm_completion(
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/clients/lm.py", line 301, in litellm_completion
    return litellm.completion(
           ^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/utils.py", line 1213, in wrapper
    raise e
  File "~/.local/lib/python3.12/site-packages/litellm/utils.py", line 1091, in wrapper
    result = original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/main.py", line 3093, in completion
    raise exception_type(
  File "~/.local/lib/python3.12/site-packages/litellm/main.py", line 2815, in completion
    response = base_llm_http_handler.completion(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/llms/custom_httpx/llm_http_handler.py", line 239, in completion
    data = provider_config.transform_request(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/llms/ollama/completion/transformation.py", line 315, in transform_request
    modified_prompt = ollama_pt(model=model, messages=messages)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/litellm/litellm_core_utils/prompt_templates/factory.py", line 265, in ollama_pt
    raise litellm.BadRequestError(
litellm.exceptions.BadRequestError: litellm.BadRequestError: Invalid Message passed in {'role': 'system', 'content': 'Your input fields are:\n1. `image` (Image): A photo\nYour output fields are:\n1. `description` (str): Detailed description of the image.\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\nInputs will have the following structure:\n\n[[ ## image ## ]]\n{image}\n\nOutputs will be a JSON object with the following fields.\n\n[[ ## description ## ]]\n{description}\nIn adhering to this structure, your objective is: \n        Describe the image in detail. Respond only in English.'}

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "~/projects/1central/image_classifier/image.py", line 32, in <module>
    result = p(image=dspy.Image.from_url(image_path))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/utils/callback.py", line 266, in wrapper
    return fn(instance, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/predict/predict.py", line 77, in __call__
    return self.forward(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/predict/predict.py", line 107, in forward
    completions = adapter(
                  ^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/chat_adapter.py", line 50, in __call__
    return JSONAdapter()(lm, lm_kwargs, signature, demos, inputs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/.local/lib/python3.12/site-packages/dspy/adapters/json_adapter.py", line 69, in __call__
    raise RuntimeError(
RuntimeError: Both structured output format and JSON mode failed. Please choose a model that supports `response_format` argument. Original error: litellm.BadRequestError: Invalid Message passed in {'role': 'system', 'content': 'Your input fields are:\n1. `image` (Image): A photo\nYour output fields are:\n1. `description` (str): Detailed description of the image.\nAll interactions will be structured in the following way, with the appropriate values filled in.\n\nInputs will have the following structure:\n\n[[ ## image ## ]]\n{image}\n\nOutputs will be a JSON object with the following fields.\n\n[[ ## description ## ]]\n{description}\nIn adhering to this structure, your objective is: \n        Describe the image in detail. Respond only in English.'}

I've also tried the same code, replacing the model provider with ollama_chat:
minicpm = dspy.LM('ollama_chat/minicpm-v:latest', api_base='http://localhost:11434', api_key='')

This results in the same error.

Questions

  • Are there any known limitations to dspy when using local LLMs with Ollama?
  • Are there additional configurations or alternate strategies I should try?
  • Any tips or directions you can point me in for debugging this?
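
One way to narrow this down is to build the multimodal message yourself and send it through litellm (the client library dspy uses under the hood) directly, bypassing dspy's adapters. This is a sketch under assumptions: a local Ollama server at the default port, and hypothetical helper names; the data-URI encoding approximates what dspy.Image produces for local files.

```python
import base64

def to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    # Hypothetical helper: base64-encode raw image bytes into a data URI,
    # similar to what dspy.Image does for local files.
    payload = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{payload}"

def image_message(text: str, data_uri: str) -> dict:
    # Hypothetical helper: an OpenAI-style multimodal user message,
    # the shape dspy's adapters hand to litellm.
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": data_uri}},
        ],
    }

# With a running Ollama server, sending this message directly would show
# whether litellm's ollama/ollama_chat handlers accept it at all:
#   import litellm
#   with open("/tmp/9221487.jpg", "rb") as f:
#       uri = to_data_uri(f.read())
#   resp = litellm.completion(
#       model="ollama_chat/minicpm-v:latest",
#       api_base="http://localhost:11434",
#       messages=[image_message("Describe this image.", uri)],
#   )
#   print(resp.choices[0].message.content)
```

If the direct litellm call fails the same way, the problem is in litellm's Ollama prompt transformation rather than in dspy itself.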

Additional Info
ollama version: 0.6.5
dspy version: 2.6.17

Steps to reproduce

Copy and run the example code above, changing image_path to point to a real image on your hard drive.

DSPy version

2.6.17

@kmeehl kmeehl added the bug Something isn't working label Apr 13, 2025
@okhat
Collaborator

okhat commented Apr 14, 2025

Try dspy.LM('ollama_chat/...')?

@kmeehl
Author

kmeehl commented Apr 14, 2025

Hi @okhat, I tried that. That's the second error output I added above.

I actually thought they were different errors, but upon running it again, they appear to be the same. I'll edit my post to reflect that.
