Description
When using an OpenAI-compatible provider with a custom base_url (e.g., self-hosted vLLM, Ollama, or any non-OpenAI endpoint), CrewAI's Converter/InternalInstructor silently discards the base_url and hits api.openai.com instead.
The root cause is in internal_instructor.py — _create_instructor_client() extracts only the model name and provider string from self.llm.model, then calls:

```python
return instructor.from_provider(f"{provider}/{model_string}")
```
(as a fenced snippet:)
The base_url from the LLM object is never forwarded. instructor.from_provider() creates a fresh OpenAI client defaulting to api.openai.com/v1/.
Note: There is a _get_llm_extra_kwargs() method that forwards base_url, but it is guarded behind is_litellm=True, so non-LiteLLM OpenAI-compatible providers are still affected.
Steps to Reproduce
- Configure a CrewAI agent with an OpenAI-compatible LLM that has a custom base_url (e.g., vLLM, Ollama, or any self-hosted endpoint)
- Set output_pydantic or output_json on a Task so that CrewAI invokes the Converter
- The Converter creates an InternalInstructor via converter.py:145-152, passing the full LLM object (which has base_url set)
- InternalInstructor._create_instructor_client() (internal_instructor.py:76-101) extracts only self.llm.model and self.llm.provider, then calls instructor.from_provider(f"{provider}/{model_string}") — base_url is lost
- The instructor client sends requests to api.openai.com instead of the configured endpoint
- Result: ConnectTimeout / ConverterError in environments that cannot reach api.openai.com
Expected behavior
When an LLM is configured with a custom base_url, the InternalInstructor should forward that base_url to the instructor client so that structured output parsing requests go to the correct endpoint — not to api.openai.com.
Screenshots/Code snippets
The buggy code path (internal_instructor.py:76-101):
```python
def _create_instructor_client(self) -> Any:
    import instructor

    if isinstance(self.llm, str):
        model_string = self.llm
    elif self.llm is not None and hasattr(self.llm, "model"):
        model_string = self.llm.model  # ← extracts model name
    else:
        raise ValueError(...)

    if isinstance(self.llm, str):
        provider = self._extract_provider()
    elif self.llm is not None and hasattr(self.llm, "provider"):
        provider = self.llm.provider  # ← extracts provider
    else:
        provider = "openai"

    # ← base_url is NEVER forwarded here
    return instructor.from_provider(f"{provider}/{model_string}")
```
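To make the data loss concrete, here is a minimal stand-in sketch. The SimpleNamespace object is a hypothetical substitute for a CrewAI LLM configured against a self-hosted endpoint; it only mirrors the string that the code above actually hands to instructor:

```python
from types import SimpleNamespace

# Hypothetical stand-in for an LLM object pointed at a self-hosted vLLM server.
llm = SimpleNamespace(
    model="gpt-4o",
    provider="openai",
    base_url="http://localhost:8000/v1",  # the custom endpoint that gets lost
)

# This is the only information _create_instructor_client() forwards:
provider_string = f"{llm.provider}/{llm.model}"

print(provider_string)  # openai/gpt-4o
# base_url appears nowhere in what instructor receives:
print(llm.base_url in provider_string)  # False
```

Since instructor only ever sees "openai/gpt-4o", it constructs a default OpenAI client and the custom endpoint is unrecoverable downstream.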
The incomplete fix — _get_llm_extra_kwargs() exists but is guarded:
```python
def _get_llm_extra_kwargs(self) -> dict[str, Any]:
    # This guard means non-litellm providers never get base_url forwarded
    if not getattr(self.llm, "is_litellm", False):
        return {}  # ← base_url lost for OpenAI-compatible providers
    extra = {}
    for attr in ("api_base", "base_url", "api_key"):
        value = getattr(self.llm, attr, None)
        if value is not None:
            extra[attr] = value
    return extra
```
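The guard's effect can be reproduced in isolation. The following is a self-contained sketch (the free function and SimpleNamespace stubs are illustrative stand-ins for the method and LLM objects, not the actual CrewAI code):

```python
from types import SimpleNamespace
from typing import Any


def get_llm_extra_kwargs(llm: Any) -> dict[str, Any]:
    # Mirrors the guarded helper: bail out for anything not LiteLLM-backed.
    if not getattr(llm, "is_litellm", False):
        return {}
    extra: dict[str, Any] = {}
    for attr in ("api_base", "base_url", "api_key"):
        value = getattr(llm, attr, None)
        if value is not None:
            extra[attr] = value
    return extra


# A non-LiteLLM OpenAI-compatible LLM: base_url is silently dropped.
vllm = SimpleNamespace(model="gpt-4o", base_url="http://localhost:8000/v1")
print(get_llm_extra_kwargs(vllm))  # {}

# The same config with is_litellm=True is forwarded correctly.
litellm = SimpleNamespace(
    model="gpt-4o", base_url="http://localhost:8000/v1", is_litellm=True
)
print(get_llm_extra_kwargs(litellm))  # {'base_url': 'http://localhost:8000/v1'}
```

Identical endpoint configuration, opposite outcomes, purely because of the is_litellm flag.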
Operating System
macOS Sonoma
Python Version
3.12
crewAI Version
latest (main branch)
crewAI Tools Version
latest
Virtual Environment
Venv
Evidence
Traced through the source code:
converter.py:145-152 — _create_instructor() correctly passes the full LLM object (with base_url):
```python
def _create_instructor(self):
    return InternalInstructor(
        llm=self.llm,  # ← has base_url set
        model=self.model,
        content=self.text,
    )
```
internal_instructor.py:76-101 — _create_instructor_client() discards base_url:
```python
model_string = self.llm.model  # only extracts model name
provider = self.llm.provider  # only extracts provider
return instructor.from_provider(f"{provider}/{model_string}")  # base_url lost
```
base_llm.py:123 — confirms base_url is a defined field on BaseLLM:
```python
base_url: str | None = None
```
_get_llm_extra_kwargs() exists but is guarded by is_litellm:
```python
if not getattr(self.llm, "is_litellm", False):
    return {}  # non-litellm providers never get base_url forwarded
```
- Result in production: All structured output requests (output_pydantic, output_json) from OpenAI-compatible providers hit api.openai.com → ConnectTimeout → ConverterError.
Possible Solution
Two options:
Option A (minimal fix): In _create_instructor_client(), pass base_url to instructor.from_provider() if the instructor library supports it, or construct an explicit OpenAI(base_url=...) client and use instructor.from_openai(client) instead.
Option B (broader fix): Remove the is_litellm guard from _get_llm_extra_kwargs() so that base_url and api_key are forwarded for all provider types, not just LiteLLM-backed ones. The guard was added because "non-litellm instructor clients (from_provider) don't accept them" — but the fix should be to use a client constructor that does accept them (e.g., instructor.from_openai(OpenAI(base_url=..., api_key=...))) rather than silently dropping the config.
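The Option B helper could be sketched as follows. This is an illustration, not the actual CrewAI code: get_llm_extra_kwargs_fixed and the stub LLM are hypothetical names, and the client construction at the end assumes the instructor and openai packages as the issue describes.

```python
from types import SimpleNamespace
from typing import Any


def get_llm_extra_kwargs_fixed(llm: Any) -> dict[str, Any]:
    # Ungated version: forward endpoint configuration for every provider type,
    # not only LiteLLM-backed ones.
    extra: dict[str, Any] = {}
    for attr in ("api_base", "base_url", "api_key"):
        value = getattr(llm, attr, None)
        if value is not None:
            extra[attr] = value
    return extra


# A non-LiteLLM OpenAI-compatible LLM now keeps its endpoint:
vllm = SimpleNamespace(model="gpt-4o", base_url="http://localhost:8000/v1")
kwargs = get_llm_extra_kwargs_fixed(vllm)
print(kwargs)  # {'base_url': 'http://localhost:8000/v1'}

# The client would then be built with a constructor that accepts these
# kwargs, roughly (sketch only, requires openai and instructor installed):
#   client = instructor.from_openai(OpenAI(**kwargs))
```

Any final implementation would still need to filter the kwargs per constructor (e.g., the OpenAI client takes base_url and api_key but not api_base), which is a detail the actual fix should handle.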
Additional context
This affects anyone using OpenAI-compatible endpoints (vLLM, Ollama remote, Azure OpenAI with custom endpoints, etc.) with structured output tasks (output_pydantic or output_json). The agent's main LLM calls work fine because they go through the LLM class directly, but the Converter/InternalInstructor path bypasses the LLM and creates its own client — losing the base_url in the process.
There are partial fixes on feature branches (commits 9dabb3e, 9bdc7b9 for LiteLLM, and f9ae6c5 for A2A) but none address the non-LiteLLM OpenAI-compatible path on main.