
Conversation


@sicoyle sicoyle commented Dec 20, 2025

Description

This PR adds:

  • Timeout support for LLM providers. As long as a context timeout is set, langchain honors it.
  • Prompt caching using metadata passed to the LLM providers. This is a direct workaround: langchain has an llms.WithPromptCaching(true) option you can set, but setting it and running the conformance tests yields an error, because it sets a bool while openai (and therefore all the providers in contrib that use openai under the hood) expects a time duration string.
  • Usage metrics, which I unfortunately had to add in order to actually test my prompt cache field workaround. This required a lot of data type translations for langchain.
  • A response format field. Users rely on this; in dapr agents we have a workaround for it that I want to remove with this addition.
  • Renamed CacheTTL to ResponseCacheTTL, since it is not prompt caching and the name should say so explicitly. The old JSON tag is still supported.
  • Bumped langchain versions so that some of the new options are supported (even with the workarounds).

API PR: dapr/dapr#9241

This PR needs to be merged first so that I can then merge the dapr/dapr one.

I ran the conformance tests against: mistral, openai, anthropic.
I have a few fixes left for ollama and aws (the latter was broken to begin with...).

Issue reference

We strive to have all PRs opened based on an issue, where the problem or feature has been discussed prior to implementation.

Please reference the issue this PR will close: #[issue number]

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

  • Code compiles correctly
  • Created/updated tests
  • Extended the documentation
    • Created the dapr/docs PR:

Note: We expect contributors to open a corresponding documentation PR in the dapr/docs repository. As the implementer, you are the best person to document your work! Implementation PRs will not be merged until the documentation PR is opened and ready for review.

…che works) for convo api

Signed-off-by: Samantha Coyle <[email protected]>
@sicoyle sicoyle requested review from a team as code owners December 20, 2025 00:02

@CasperGN CasperGN left a comment


@sicoyle does this (or the linked pr) also carry bubbling out the model, tokens etc into the metadata api like we discussed?

if textLength == 0 {
	return 0
}
// Rough heuristic: roughly 4 characters per token, rounded up.
return int64((textLength + 3) / 4)

Looking over the OpenAI example, would it be better to iterate over the words in the input instead?
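A word-based variant of the estimator could look like the sketch below (a hypothetical alternative for discussion, not code from this PR; the 4-tokens-per-3-words ratio is the common ~0.75 words-per-token rule of thumb):

```go
package main

import (
	"fmt"
	"strings"
)

// estimateTokensByWords counts whitespace-separated words and assumes
// roughly 4 tokens for every 3 words (~0.75 words per token), rounded up.
func estimateTokensByWords(text string) int64 {
	words := len(strings.Fields(text))
	if words == 0 {
		return 0
	}
	return int64((words*4 + 2) / 3)
}

func main() {
	fmt.Println(estimateTokensByWords("hello world from dapr")) // 4 words -> prints 6
}
```

Either heuristic is only an approximation; exact counts would require the provider's tokenizer.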

