
Commit cfcd777

refactor: move enable_structured_output_with_tools to LitellmModel
Moved the enable_structured_output_with_tools parameter from the Agent class to LitellmModel.__init__() to minimize the diff and isolate changes within the LiteLLM adapter, as requested during code review.

Changes:
- Added enable_structured_output_with_tools parameter to LitellmModel.__init__()
- Stored it as an instance variable and used it throughout LitellmModel
- Removed the parameter from the Agent class and related validation
- Removed the parameter from the Model interface (get_response / stream_response)
- Removed the parameter from the Runner (no longer passed to model calls)
- Removed the parameter from the OpenAI model implementations
- Reverted test mock models to their original signatures
- Updated test_gemini_local.py for model-level configuration
- Updated documentation to reflect model-level usage

Before: Agent(model=..., enable_structured_output_with_tools=True)
After: Agent(model=LitellmModel(..., enable_structured_output_with_tools=True))
1 parent 6c637fd commit cfcd777
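
In code, the before/after from the commit message amounts to the following. This is a minimal sketch of the new call site, not part of the diff itself; the toy tool and Pydantic output type are stand-ins modeled on the docs examples further down:

```python
from pydantic import BaseModel

from agents import Agent, function_tool
from agents.extensions.models.litellm_model import LitellmModel


class WeatherReport(BaseModel):
    city: str
    temperature_c: float


@function_tool
def get_weather(city: str) -> dict:
    """Toy tool, for illustration only."""
    return {"city": city, "temperature_c": 21.0}


# Before: Agent(model=..., enable_structured_output_with_tools=True)
# After: the flag is configured on the LiteLLM adapter itself.
agent = Agent(
    name="Weather assistant",
    model=LitellmModel(
        "gemini/gemini-2.5-flash",
        enable_structured_output_with_tools=True,  # Required for Gemini
    ),
    tools=[get_weather],
    output_type=WeatherReport,
)
```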

14 files changed (+42, −96 lines)


docs/agents.md

Lines changed: 5 additions & 3 deletions
@@ -81,14 +81,16 @@ from agents.extensions.models.litellm_model import LitellmModel
 
 agent = Agent(
     name="Weather assistant",
-    model=LitellmModel("gemini/gemini-1.5-flash"),
+    model=LitellmModel(
+        "gemini/gemini-2.5-flash",
+        enable_structured_output_with_tools=True,  # Required for Gemini
+    ),
     tools=[get_weather],
     output_type=WeatherReport,
-    enable_structured_output_with_tools=True,  # Required for Gemini
 )
 ```
 
-The `enable_structured_output_with_tools` parameter injects JSON formatting instructions into the system prompt as a workaround. This is only needed for models accessed via [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] that lack native support. OpenAI models ignore this parameter.
+The `enable_structured_output_with_tools` parameter on [`LitellmModel`][agents.extensions.models.litellm_model.LitellmModel] injects JSON formatting instructions into the system prompt as a workaround. This is only needed for models that lack native support for using tools and structured outputs simultaneously (like Gemini).
 
 See the [prompt injection documentation](models/structured_output_with_tools.md) for more details.
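
For a sense of what the "injected JSON formatting instructions" described above could look like, here is a hypothetical sketch. The commit does not show `get_json_output_prompt` itself, so the helper name and wording below are illustrative only:

```python
import json


def sketch_json_output_prompt(json_schema: dict) -> str:
    """Hypothetical stand-in for get_json_output_prompt (not shown in this diff)."""
    return (
        "You must respond with a single JSON object conforming to this JSON "
        "Schema, with no surrounding prose or code fences:\n"
        + json.dumps(json_schema, indent=2)
    )


# For a WeatherReport-like schema, the injected instructions might read:
print(sketch_json_output_prompt({
    "type": "object",
    "properties": {"city": {"type": "string"}, "temperature_c": {"type": "number"}},
    "required": ["city", "temperature_c"],
}))
```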

docs/models/litellm.md

Lines changed: 5 additions & 3 deletions
@@ -111,13 +111,15 @@ def analyze_data(query: str) -> dict:
 
 agent = Agent(
     name="Analyst",
-    model=LitellmModel("gemini/gemini-1.5-flash"),
+    model=LitellmModel(
+        "gemini/gemini-2.5-flash",
+        enable_structured_output_with_tools=True,  # Required for Gemini
+    ),
     tools=[analyze_data],
     output_type=Report,
-    enable_structured_output_with_tools=True,  # Required for Gemini
 )
 ```
 
-The `enable_structured_output_with_tools` parameter enables a workaround that injects JSON formatting instructions into the system prompt instead of using the native API. This allows models like Gemini to return structured outputs even when using tools.
+The `enable_structured_output_with_tools` parameter on `LitellmModel` enables a workaround that injects JSON formatting instructions into the system prompt instead of using the native API. This allows models like Gemini to return structured outputs even when using tools.
 
 See the [prompt injection documentation](structured_output_with_tools.md) for complete details.

docs/models/structured_output_with_tools.md

Lines changed: 10 additions & 6 deletions
@@ -25,7 +25,7 @@ def get_weather(city: str) -> dict:
 
 # This causes an error with Gemini
 agent = Agent(
-    model=LitellmModel("gemini/gemini-1.5-flash"),
+    model=LitellmModel("gemini/gemini-2.5-flash"),
     tools=[get_weather],
     output_type=WeatherReport,  # Error: can't use both!
 )
@@ -40,14 +40,16 @@ GeminiException BadRequestError - Function calling with a response mime type
 
 ## The Solution
 
-Enable prompt injection by setting `enable_structured_output_with_tools=True` on your agent:
+Enable prompt injection by setting `enable_structured_output_with_tools=True` on the `LitellmModel`:
 
 ```python
 agent = Agent(
-    model=LitellmModel("gemini/gemini-1.5-flash"),
+    model=LitellmModel(
+        "gemini/gemini-2.5-flash",
+        enable_structured_output_with_tools=True,  # ← Enables the workaround
+    ),
     tools=[get_weather],
     output_type=WeatherReport,
-    enable_structured_output_with_tools=True,  # ← Enables the workaround
 )
 ```
 
@@ -90,10 +92,12 @@ async def main():
     agent = Agent(
         name="WeatherBot",
         instructions="Use the get_weather tool, then provide a structured report.",
-        model=LitellmModel("gemini/gemini-1.5-flash"),
+        model=LitellmModel(
+            "gemini/gemini-2.5-flash",
+            enable_structured_output_with_tools=True,  # Required for Gemini
+        ),
         tools=[get_weather],
         output_type=WeatherReport,
-        enable_structured_output_with_tools=True,  # Required for Gemini
     )
 
     result = await Runner.run(agent, "What's the weather in Tokyo?")

src/agents/agent.py

Lines changed: 0 additions & 16 deletions
@@ -231,16 +231,6 @@ class Agent(AgentBase, Generic[TContext]):
     """Whether to reset the tool choice to the default value after a tool has been called. Defaults
     to True. This ensures that the agent doesn't enter an infinite loop of tool usage."""
 
-    enable_structured_output_with_tools: bool = False
-    """Enable structured outputs when using tools on models that don't natively support both
-    simultaneously (e.g., Gemini). When enabled, injects JSON formatting instructions into the
-    system prompt as a workaround instead of using the native API. Defaults to False (use native
-    API support when available).
-
-    Set to True when using models that don't support both features natively (e.g., Gemini via
-    LiteLLM).
-    """
-
     def __post_init__(self):
         from typing import get_origin
 
@@ -374,12 +364,6 @@ def __post_init__(self):
                 f"got {type(self.reset_tool_choice).__name__}"
             )
 
-        if not isinstance(self.enable_structured_output_with_tools, bool):
-            raise TypeError(
-                f"Agent enable_structured_output_with_tools must be a boolean, "
-                f"got {type(self.enable_structured_output_with_tools).__name__}"
-            )
-
     def clone(self, **kwargs: Any) -> Agent[TContext]:
         """Make a copy of the agent, with the given arguments changed.
         Notes:

src/agents/extensions/models/litellm_model.py

Lines changed: 5 additions & 10 deletions
@@ -74,10 +74,12 @@ def __init__(
         model: str,
         base_url: str | None = None,
         api_key: str | None = None,
+        enable_structured_output_with_tools: bool = False,
     ):
         self.model = model
         self.base_url = base_url
         self.api_key = api_key
+        self.enable_structured_output_with_tools = enable_structured_output_with_tools
 
     async def get_response(
         self,
@@ -89,9 +91,8 @@ async def get_response(
         handoffs: list[Handoff],
         tracing: ModelTracing,
         previous_response_id: str | None = None,  # unused
-        conversation_id: str | None = None,  # unused
+        conversation_id: str | None = None,
         prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ModelResponse:
         with generation_span(
             model=str(self.model),
@@ -110,7 +111,6 @@
                 tracing,
                 stream=False,
                 prompt=prompt,
-                enable_structured_output_with_tools=enable_structured_output_with_tools,
             )
 
             message: litellm.types.utils.Message | None = None
@@ -195,9 +195,8 @@ async def stream_response(
         handoffs: list[Handoff],
         tracing: ModelTracing,
         previous_response_id: str | None = None,  # unused
-        conversation_id: str | None = None,  # unused
+        conversation_id: str | None = None,
        prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> AsyncIterator[TResponseStreamEvent]:
         with generation_span(
             model=str(self.model),
@@ -216,7 +215,6 @@
                 tracing,
                 stream=True,
                 prompt=prompt,
-                enable_structured_output_with_tools=enable_structured_output_with_tools,
             )
 
             final_response: Response | None = None
@@ -248,7 +246,6 @@ async def _fetch_response(
         tracing: ModelTracing,
         stream: Literal[True],
         prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
    ) -> tuple[Response, AsyncStream[ChatCompletionChunk]]: ...
 
     @overload
@@ -264,7 +261,6 @@ async def _fetch_response(
         tracing: ModelTracing,
         stream: Literal[False],
         prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> litellm.types.utils.ModelResponse: ...
 
     async def _fetch_response(
@@ -279,7 +275,6 @@
         tracing: ModelTracing,
         stream: bool = False,
         prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> litellm.types.utils.ModelResponse | tuple[Response, AsyncStream[ChatCompletionChunk]]:
         # Preserve reasoning messages for tool calls when reasoning is on
         # This is needed for models like Claude 4 Sonnet/Opus which support interleaved thinking
@@ -298,7 +293,7 @@
         # Check if we need to inject JSON output prompt for models that don't support
         # tools + structured output simultaneously (like Gemini)
         inject_json_prompt = should_inject_json_prompt(
-            output_schema, tools, enable_structured_output_with_tools
+            output_schema, tools, self.enable_structured_output_with_tools
         )
         if inject_json_prompt and output_schema:
            json_prompt = get_json_output_prompt(output_schema)
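
The last hunk above only threads the stored flag into `should_inject_json_prompt`; the helper's body is not part of this commit. Its gating presumably reduces to a three-way conjunction, roughly as follows (a sketch under that assumption, with the signature inferred from the call site):

```python
from typing import Any


def should_inject_json_prompt(
    output_schema: Any | None,
    tools: list[Any],
    enable_structured_output_with_tools: bool,
) -> bool:
    """Sketch only: the real implementation lives elsewhere in the repo."""
    # Inject only when the workaround is opted in AND the request actually
    # combines a structured output schema with tool definitions.
    return (
        enable_structured_output_with_tools
        and output_schema is not None
        and len(tools) > 0
    )
```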

src/agents/models/interface.py

Lines changed: 0 additions & 8 deletions
@@ -50,7 +50,6 @@ async def get_response(
         previous_response_id: str | None,
         conversation_id: str | None,
         prompt: ResponsePromptParam | None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ModelResponse:
         """Get a response from the model.
 
@@ -66,9 +65,6 @@
                 except for the OpenAI Responses API.
             conversation_id: The ID of the stored conversation, if any.
             prompt: The prompt config to use for the model.
-            enable_structured_output_with_tools: Whether to inject JSON formatting instructions
-                into the system prompt when using structured outputs with tools. Required for
-                models that don't support both features natively (like Gemini).
 
         Returns:
             The full model response.
@@ -89,7 +85,6 @@ def stream_response(
         previous_response_id: str | None,
         conversation_id: str | None,
         prompt: ResponsePromptParam | None,
-        enable_structured_output_with_tools: bool = False,
     ) -> AsyncIterator[TResponseStreamEvent]:
         """Stream a response from the model.
 
@@ -105,9 +100,6 @@
                 except for the OpenAI Responses API.
             conversation_id: The ID of the stored conversation, if any.
             prompt: The prompt config to use for the model.
-            enable_structured_output_with_tools: Whether to inject JSON formatting instructions
-                into the system prompt when using structured outputs with tools. Required for
-                models that don't support both features natively (like Gemini).
 
         Returns:
             An iterator of response stream events, in OpenAI Responses format.

src/agents/models/openai_chatcompletions.py

Lines changed: 0 additions & 11 deletions
@@ -59,7 +59,6 @@ async def get_response(
         previous_response_id: str | None = None,  # unused
         conversation_id: str | None = None,  # unused
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ModelResponse:
         with generation_span(
             model=str(self.model),
@@ -77,7 +76,6 @@
                 tracing,
                 stream=False,
                 prompt=prompt,
-                enable_structured_output_with_tools=enable_structured_output_with_tools,
             )
 
             message: ChatCompletionMessage | None = None
@@ -149,7 +147,6 @@ async def stream_response(
         previous_response_id: str | None = None,  # unused
         conversation_id: str | None = None,  # unused
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> AsyncIterator[TResponseStreamEvent]:
         """
         Yields a partial message as it is generated, as well as the usage information.
@@ -170,7 +167,6 @@
                 tracing,
                 stream=True,
                 prompt=prompt,
-                enable_structured_output_with_tools=enable_structured_output_with_tools,
             )
 
             final_response: Response | None = None
@@ -202,7 +198,6 @@ async def _fetch_response(
         tracing: ModelTracing,
         stream: Literal[True],
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> tuple[Response, AsyncStream[ChatCompletionChunk]]: ...
 
     @overload
@@ -218,7 +213,6 @@ async def _fetch_response(
         tracing: ModelTracing,
         stream: Literal[False],
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ChatCompletion: ...
 
     async def _fetch_response(
@@ -233,12 +227,7 @@
         tracing: ModelTracing,
         stream: bool = False,
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ChatCompletion | tuple[Response, AsyncStream[ChatCompletionChunk]]:
-        # Note: enable_structured_output_with_tools parameter is accepted for interface consistency
-        # but not used for OpenAI models since they have native support for
-        # tools + structured outputs simultaneously
-
         converted_messages = Converter.items_to_messages(input)
 
         if system_instructions:

src/agents/models/openai_responses.py

Lines changed: 0 additions & 2 deletions
@@ -84,7 +84,6 @@ async def get_response(
         previous_response_id: str | None = None,
         conversation_id: str | None = None,
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ModelResponse:
         with response_span(disabled=tracing.is_disabled()) as span_response:
             try:
@@ -162,7 +161,6 @@ async def stream_response(
         previous_response_id: str | None = None,
         conversation_id: str | None = None,
         prompt: ResponsePromptParam | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> AsyncIterator[ResponseStreamEvent]:
         """
         Yields a partial message as it is generated, as well as the usage information.

src/agents/run.py

Lines changed: 6 additions & 26 deletions
@@ -1287,18 +1287,6 @@ async def _run_single_turn_streamed(
         )
 
         # 1. Stream the output events
-        # Build kwargs for model call
-        model_kwargs: dict[str, Any] = {
-            "previous_response_id": previous_response_id,
-            "conversation_id": conversation_id,
-            "prompt": prompt_config,
-        }
-
-        # Only pass enable_structured_output_with_tools when enabled
-        # to maintain backward compatibility with third-party Model implementations
-        if agent.enable_structured_output_with_tools:
-            model_kwargs["enable_structured_output_with_tools"] = True
-
         async for event in model.stream_response(
             filtered.instructions,
             filtered.input,
@@ -1309,7 +1297,9 @@
             get_model_tracing_impl(
                 run_config.tracing_disabled, run_config.trace_include_sensitive_data
             ),
-            **model_kwargs,
+            previous_response_id=previous_response_id,
+            conversation_id=conversation_id,
+            prompt=prompt_config,
         ):
             # Emit the raw event ASAP
             streamed_result._event_queue.put_nowait(RawResponsesStreamEvent(data=event))
@@ -1732,18 +1722,6 @@ async def _get_new_response(
         server_conversation_tracker.conversation_id if server_conversation_tracker else None
     )
 
-    # Build kwargs for model call
-    model_kwargs: dict[str, Any] = {
-        "previous_response_id": previous_response_id,
-        "conversation_id": conversation_id,
-        "prompt": prompt_config,
-    }
-
-    # Only pass enable_structured_output_with_tools when enabled
-    # to maintain backward compatibility with third-party Model implementations
-    if agent.enable_structured_output_with_tools:
-        model_kwargs["enable_structured_output_with_tools"] = True
-
     new_response = await model.get_response(
         system_instructions=filtered.instructions,
         input=filtered.input,
@@ -1754,7 +1732,9 @@
         tracing=get_model_tracing_impl(
             run_config.tracing_disabled, run_config.trace_include_sensitive_data
         ),
-        **model_kwargs,
+        previous_response_id=previous_response_id,
+        conversation_id=conversation_id,
+        prompt=prompt_config,
    )
 
     context_wrapper.usage.add(new_response.usage)
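
The deleted `model_kwargs` dance existed because unconditionally forwarding the keyword would break third-party `Model` implementations that never declared it. A self-contained toy illustration of that failure mode (these classes are stand-ins, not the SDK's):

```python
import asyncio


class LegacyThirdPartyModel:
    """Toy stand-in for a third-party Model written before the flag existed."""

    async def get_response(self, *, prompt=None):  # accepts no extra keyword
        return "ok"


async def main() -> None:
    model = LegacyThirdPartyModel()

    # The removed Runner code only added the keyword when it was True:
    kwargs = {"prompt": None}
    enable_structured_output_with_tools = False  # toy local flag
    if enable_structured_output_with_tools:
        kwargs["enable_structured_output_with_tools"] = True

    print(await model.get_response(**kwargs))  # -> ok

    # Forwarding the keyword unconditionally would instead raise:
    #   TypeError: get_response() got an unexpected keyword argument
    # Configuring the flag on LitellmModel removes the need for this dance.


asyncio.run(main())
```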

tests/fake_model.py

Lines changed: 0 additions & 2 deletions
@@ -90,7 +90,6 @@ async def get_response(
         previous_response_id: str | None,
         conversation_id: str | None,
         prompt: Any | None,
-        enable_structured_output_with_tools: bool = False,
     ) -> ModelResponse:
         turn_args = {
             "system_instructions": system_instructions,
@@ -141,7 +140,6 @@
         previous_response_id: str | None = None,
         conversation_id: str | None = None,
         prompt: Any | None = None,
-        enable_structured_output_with_tools: bool = False,
     ) -> AsyncIterator[TResponseStreamEvent]:
         turn_args = {
             "system_instructions": system_instructions,
