Bug Report: inlineData image parts dropped when using /run endpoint
Description
When sending a multimodal message (text + image) to the /run endpoint, the inlineData part is not forwarded to the underlying LLM. The image arrives at the ADK server correctly but is silently dropped before the LiteLLM call, resulting in the model receiving an empty images: [] array.
Environment
- ADK version: latest
- LiteLLM version: 1.82.4
- Model: ollama_chat/ministral-3:8b via Ollama
- Ollama: local instance at localhost:11434
Steps to Reproduce
- Start an ADK agent with a vision-capable Ollama model:
  root_agent = Agent(
      model=LiteLlm(model="ollama_chat/ministral-3:8b"),
      name="assistant_agent",
      ...
  )
- Send a multimodal request to /run with both text and image parts:
  {
    "appName": "default",
    "userId": "user",
    "sessionId": "test",
    "newMessage": {
      "role": "user",
      "parts": [
        {"text": "What is in this image?"},
        {
          "inlineData": {
            "mimeType": "image/png",
            "data": "<base64-encoded-image>"
          }
        }
      ]
    },
    "streaming": false
  }
- Observe the model responds as if no image was provided.
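The repro steps above can be scripted. The sketch below is a hypothetical helper (the function name `build_run_payload` is not from ADK) that assembles the same /run payload from raw image bytes; the dummy bytes stand in for a real PNG:

```python
import base64
import json

def build_run_payload(image_bytes: bytes, prompt: str) -> dict:
    """Build the /run request body shown in the report above."""
    return {
        "appName": "default",
        "userId": "user",
        "sessionId": "test",
        "newMessage": {
            "role": "user",
            "parts": [
                {"text": prompt},
                {
                    # The part that gets dropped on the way to LiteLLM.
                    "inlineData": {
                        "mimeType": "image/png",
                        "data": base64.b64encode(image_bytes).decode("ascii"),
                    }
                },
            ],
        },
        "streaming": False,
    }

# Any bytes illustrate the shape; a real repro would use actual PNG data.
payload = build_run_payload(b"\x89PNG-placeholder", "What is in this image?")
print(json.dumps(payload, indent=2)[:120])
```

POSTing this body to the /run endpoint (e.g. with `requests`) reproduces the behavior described above.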
Debug Evidence
With debug logging enabled:

  import litellm
  litellm._turn_on_debug()

the outgoing request to Ollama carries an empty images array, even though the inlineData part is present in the /run payload:
  POST http://localhost:11434/api/chat
  {
    "messages": [
      {"role": "user", "content": "What is in this image?", "images": []}
    ]
  }
The inlineData base64 bytes are never populated into the images field.
Expected Behavior
The inlineData part from the /run payload should be converted and forwarded as base64 image data in the outgoing LLM request.
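For illustration, a minimal sketch of the kind of conversion expected here, assuming the ADK layer maps parts to OpenAI-style message content (which LiteLLM in turn translates into Ollama's images field). The helper name `part_to_litellm_content` is hypothetical, not an actual ADK function:

```python
def part_to_litellm_content(part: dict) -> dict:
    """Convert one /run message part into OpenAI-style content for LiteLLM."""
    if "text" in part:
        return {"type": "text", "text": part["text"]}
    if "inlineData" in part:
        blob = part["inlineData"]
        # Pack the base64 payload into a data URL; LiteLLM accepts
        # image_url content in this form for multimodal messages.
        data_url = f"data:{blob['mimeType']};base64,{blob['data']}"
        return {"type": "image_url", "image_url": {"url": data_url}}
    raise ValueError(f"unsupported part: {part}")

parts = [
    {"text": "What is in this image?"},
    {"inlineData": {"mimeType": "image/png", "data": "iVBORw0KGgo="}},
]
content = [part_to_litellm_content(p) for p in parts]
```

If a step like this were applied, the image would survive the hop instead of being dropped to an empty array.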
Actual Behavior
The images field is sent as an empty array []. The model responds as if no image was attached.
Additional Notes
The same model (ministral-3:8b) correctly processes images when called directly via the Ollama native UI or via the Ollama Python SDK, confirming the model itself supports vision. The issue is specific to ADK's message conversion layer between the /run API and the LiteLLM call.
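The working direct path can be sketched as follows. This is a hedged example of the Ollama Python SDK call that succeeds (the image bytes are placeholders, and the chat call is commented out since it requires a running local Ollama instance):

```python
import base64

# Base64-encode the image exactly as the SDK expects; placeholder bytes here.
image_b64 = base64.b64encode(b"\x89PNG-placeholder").decode("ascii")

messages = [
    {
        "role": "user",
        "content": "What is in this image?",
        # Unlike the ADK /run path, the image actually lands in this field.
        "images": [image_b64],
    }
]

# Uncomment against a local Ollama instance at localhost:11434:
# import ollama
# response = ollama.chat(model="ministral-3:8b", messages=messages)
# print(response["message"]["content"])
```

The contrast with the debug output above (images: []) localizes the bug to ADK's part-to-message conversion rather than the model or Ollama itself.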