
inlineData image parts dropped when using /run endpoint #4975

@envyram

Description

When sending a multimodal message (text + image) to the /run endpoint, the inlineData part is not forwarded to the underlying LLM. The image arrives at the ADK server correctly but is silently dropped before the LiteLLM call, resulting in the model receiving an empty images: [] array.

Environment

  • ADK version: latest
  • LiteLLM version: 1.82.4
  • Model: ollama_chat/ministral-3:8b via Ollama
  • Ollama: local instance at localhost:11434

Steps to Reproduce

  1. Start an ADK agent with a vision-capable Ollama model:

```python
root_agent = Agent(
    model=LiteLlm(model="ollama_chat/ministral-3:8b"),
    name="assistant_agent",
    ...
)
```
  2. Send a multimodal request to /run with both text and image parts:

```json
{
  "appName": "default",
  "userId": "user",
  "sessionId": "test",
  "newMessage": {
    "role": "user",
    "parts": [
      {"text": "What is in this image?"},
      {
        "inlineData": {
          "mimeType": "image/png",
          "data": "<base64-encoded-image>"
        }
      }
    ]
  },
  "streaming": false
}
```
  3. Observe that the model responds as if no image was provided.
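The repro request can be built and sent programmatically. A minimal sketch, assuming a local ADK server; the helper name, the placeholder PNG bytes, and the port are illustrative, not from the original report:

```python
import base64
import json

def build_run_payload(text: str, image_bytes: bytes) -> dict:
    """Build the /run request body with a text part and an inlineData image part."""
    return {
        "appName": "default",
        "userId": "user",
        "sessionId": "test",
        "newMessage": {
            "role": "user",
            "parts": [
                {"text": text},
                {
                    "inlineData": {
                        "mimeType": "image/png",
                        # inlineData.data is base64-encoded image bytes
                        "data": base64.b64encode(image_bytes).decode("ascii"),
                    }
                },
            ],
        },
        "streaming": False,
    }

payload = build_run_payload("What is in this image?", b"\x89PNG\r\n\x1a\n")
# POST this payload to the ADK server's /run endpoint, e.g. with
# requests.post("http://localhost:8000/run", json=payload)  # port is an assumption
print(json.dumps(payload, indent=2)[:80])
```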

Debug Evidence

With LiteLLM debug logging enabled:

```python
import litellm
litellm._turn_on_debug()
```

the outgoing request to Ollama shows an empty image array, despite the image being present in the /run payload:

```
POST http://localhost:11434/api/chat
{
  "messages": [
    {"role": "user", "content": "What is in this image?", "images": []}
  ]
}
```

The inlineData base64 payload is never copied into the images field of the outgoing request.
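The conversion the report expects the ADK layer to perform can be sketched as follows; this is a hedged illustration (the function and variable names are made up, not ADK's actual internals), based on the fact that Ollama's /api/chat takes base64 image strings in an "images" list:

```python
def parts_to_ollama_message(parts: list) -> dict:
    """Fold text and inlineData parts into a single Ollama chat message.

    inlineData.data is already base64-encoded, which is exactly what
    Ollama's "images" field expects, so it can be passed through as-is.
    """
    text_chunks, images = [], []
    for part in parts:
        if "text" in part:
            text_chunks.append(part["text"])
        elif "inlineData" in part:
            images.append(part["inlineData"]["data"])  # keep base64 as-is
    return {"role": "user", "content": " ".join(text_chunks), "images": images}

msg = parts_to_ollama_message([
    {"text": "What is in this image?"},
    {"inlineData": {"mimeType": "image/png", "data": "aGVsbG8="}},
])
# msg["images"] now holds one base64 entry instead of the empty list
# observed in the debug trace above.
```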

Expected Behavior

The inlineData part from the /run payload should be converted and forwarded as base64 image data in the outgoing LLM request.

Actual Behavior

The images field is sent as an empty array []. The model responds as if no image was attached.

Additional Notes

The same model (ministral-3:8b) correctly processes images when called directly via the Ollama native UI or via the Ollama Python SDK, confirming the model itself supports vision. The issue is specific to ADK's message conversion layer between the /run API and the LiteLLM call.
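The direct-SDK check described above can be reproduced along these lines. A sketch only: the helper name is illustrative, and the actual chat call is left commented out because it requires a running local Ollama instance:

```python
import base64

def build_ollama_message(prompt: str, image_bytes: bytes) -> dict:
    """Build a chat message the way the Ollama Python SDK accepts it:
    base64 image strings (or raw bytes) in the "images" list."""
    return {
        "role": "user",
        "content": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
    }

message = build_ollama_message("What is in this image?", b"\x89PNG\r\n\x1a\n")

# With Ollama running locally, this exercises the model directly,
# bypassing the ADK/LiteLLM conversion layer (requires `pip install ollama`):
# import ollama
# response = ollama.chat(model="ministral-3:8b", messages=[message])
```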

Metadata

Labels: models [Component] Issues related to model support