Conversation

@jakesteelman-insurica commented Oct 7, 2025

This PR adds an extra condition to handle chunks that carry token usage information but have no choices or delta properties. I opened #198, then traced the cause to GPT-5 sending a chunk at the end that was not handled by any branch of the if/elif statements, since it has neither chunk.choices nor chunk.delta.
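For context, here is a minimal sketch of the kind of branch this adds; the helper shape and names below are illustrative assumptions, not the exact code in chat_models.py:

from langchain_core.messages.ai import AIMessageChunk, UsageMetadata
from langchain_core.outputs import ChatGenerationChunk

# Illustrative sketch only: `chunk` is an OpenAI-style ChatCompletionChunk
# and `stream_usage` mirrors the existing flag in chat_models.py.
def usage_only_chunk_to_generation(chunk, stream_usage):
    if not chunk.choices and chunk.usage and stream_usage:
        usage = UsageMetadata(
            input_tokens=chunk.usage.prompt_tokens,
            output_tokens=chunk.usage.completion_tokens,
            total_tokens=chunk.usage.total_tokens,
        )
        # An empty AIMessageChunk that only carries the usage metadata.
        return ChatGenerationChunk(
            message=AIMessageChunk(content="", usage_metadata=usage)
        )
    return None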

Here's what the chunks look like when calling llm.stream(...), if you add a print("Chunk Received:", chunk) statement on line 635 of chat_models.py. Note that the final chunk has empty choices but carries usage:

from databricks_langchain import ChatDatabricks

llm = ChatDatabricks(endpoint="gpt-5")
# Consume the generator so the stream actually runs and the chunks print.
for _ in llm.stream("hello"):
    pass
Chunk Received: ChatCompletionChunk(id='chatcmpl-CO5Nwd3B1mZzhy32nob4TgRGSOYnY', choices=[Choice(delta=ChoiceDelta(content='Hello', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1759856608, model='gpt-5-2025-08-07', object='chat.completion.chunk', service_tier='default', system_fingerprint=None, usage=None, obfuscation='beO2Lti')
Chunk Received: ChatCompletionChunk(id='chatcmpl-CO5Nwd3B1mZzhy32nob4TgRGSOYnY', choices=[Choice(delta=ChoiceDelta(content=' ,', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1759856608, model='gpt-5-2025-08-07', object='chat.completion.chunk', service_tier='default', system_fingerprint=None, usage=None, obfuscation='c5xtxoC')
...
Chunk Received: ChatCompletionChunk(id='chatcmpl-CO5Nwd3B1mZzhy32nob4TgRGSOYnY', choices=[], created=1759856608, model='gpt-5-2025-08-07', object='chat.completion.chunk', service_tier='default', system_fingerprint=None, usage=CompletionUsage(completion_tokens=267, prompt_tokens=4861, total_tokens=5128, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=192, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0)), obfuscation='wts2')
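Without a branch for that final chunk, its usage is silently dropped and the aggregated message ends up without usage_metadata. One way to observe the difference, using the standard LangChain pattern of summing message chunks:

from databricks_langchain import ChatDatabricks

llm = ChatDatabricks(endpoint="gpt-5")

# AIMessageChunk supports "+", which merges content and usage_metadata.
full = None
for chunk in llm.stream("hello"):
    full = chunk if full is None else full + chunk

# With the fix this prints the token counts from the final usage-only
# chunk; without it, usage_metadata is None for affected endpoints.
print(full.usage_metadata)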

jakesteelman-insurica marked this pull request as draft on October 7, 2025 17:46
jakesteelman-insurica changed the title from "add a check for non-delta non-choices usage chunk" to "Fix missing token usage from some external model endpoints" on Oct 7, 2025
jakesteelman-insurica marked this pull request as ready for review on October 7, 2025 17:58
@WIll-Xu35

Facing the same issue here.

@sunishsheth2009 (Contributor) left a comment

Thank you for your contribution here and for helping to fix this issue.

On the diff in chat_models.py:

logprobs=generation_info.get("logprobs"),
)
yield generation_chunk
elif chunk.usage and stream_usage:

Can we add a test in integrations/langchain/tests/unit_tests/test_chat_models.py for the case where the last chunk is usage data, verifying that the usage is added to the response correctly?
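For illustration, a rough sketch of what such a test could look like; the mocked client attribute and chunk shapes are assumptions about the suite's conventions, not the actual fixtures in test_chat_models.py:

from types import SimpleNamespace
from unittest import mock

from databricks_langchain import ChatDatabricks


def _chunk(content=None, usage=None):
    # Minimal OpenAI-style chunk stand-in; real tests may use richer fixtures.
    choices = []
    if content is not None:
        delta = SimpleNamespace(content=content, role="assistant", tool_calls=None)
        choices = [SimpleNamespace(delta=delta, finish_reason=None, index=0, logprobs=None)]
    return SimpleNamespace(id="chatcmpl-test", choices=choices, usage=usage)


def test_stream_yields_usage_from_final_chunk():
    usage = SimpleNamespace(prompt_tokens=4861, completion_tokens=267, total_tokens=5128)
    chunks = [_chunk("Hello"), _chunk(" world"), _chunk(usage=usage)]

    llm = ChatDatabricks(endpoint="gpt-5")
    # Patching `llm.client` is an assumption about ChatDatabricks internals.
    with mock.patch.object(llm, "client") as client:
        client.chat.completions.create.return_value = iter(chunks)
        full = None
        for piece in llm.stream("hello"):
            full = piece if full is None else full + piece

    assert full.usage_metadata is not None
    assert full.usage_metadata["total_tokens"] == 5128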
