
Conversation


@jonathanortega2023 jonathanortega2023 commented Oct 21, 2025

Pull Request Type

  • 🐛 fix

Relevant Issues

resolves #4567

What is in this change?

  • Updated LLMPerformanceMonitor to support provider-reported duration timing for more accurate output TPS calculations
  • measureAsyncFunction now accepts a duration from output.usage.duration
  • Modified measureStream to accept a duration from reportedUsage.duration when the LLM provider supplies one
  • Updated the Ollama provider to use eval_duration for a more accurate TPS calculation

This is a generic enhancement that any provider can leverage to report more accurate timing information, while maintaining backward compatibility with providers that don't supply duration metrics.
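
Below is a minimal sketch of the intended flow. Only `eval_count` and `eval_duration` come from Ollama's API (the number of generated tokens and the generation time in nanoseconds); every other name here is an assumption for illustration, not the PR's actual code.

```ts
// Hypothetical sketch only; field and function names other than Ollama's
// eval_count / eval_duration are assumptions, not the actual implementation.

interface ReportedUsage {
  completion_tokens?: number;
  // Provider-reported generation time in seconds (assumed field name).
  duration?: number;
}

// Ollama reports eval_duration in nanoseconds and eval_count as the number
// of generated tokens, so the provider can pass an exact duration along.
function usageFromOllamaFinalChunk(chunk: {
  eval_count?: number;
  eval_duration?: number;
}): ReportedUsage {
  return {
    completion_tokens: chunk.eval_count,
    duration: chunk.eval_duration ? chunk.eval_duration / 1e9 : undefined,
  };
}

// Monitor side: prefer the provider-reported duration and fall back to the
// locally measured elapsed time, which keeps backward compatibility with
// providers that do not report duration metrics.
function outputTps(usage: ReportedUsage, measuredSeconds: number): number {
  const seconds = usage.duration ?? measuredSeconds;
  if (!usage.completion_tokens || seconds <= 0) return 0;
  return usage.completion_tokens / seconds;
}
```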

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated
  • I have tested my code functionality
  • Docker build succeeds locally

@shatfield4 (Collaborator) left a comment:

LGTM. Refactored to work with streaming and now utilizes LLMPerformanceMonitor.

@shatfield4 shatfield4 added the PR:needs review (Needs review by core team) label and removed the blocked label Oct 22, 2025
@shatfield4 shatfield4 changed the title from "fix: Use eval_duration for output TPS calculations and add as a metri…" to "fix: Use eval_duration for output TPS calculations in Ollama LLM provider" Oct 22, 2025
@timothycarambat (Member) left a comment:

This PR does not address duration timing in measureAsyncFunction, and if we are going to fix it, we should do so in both places.

We should also either make duration an overridable property in the PerformanceMonitor or change it specifically in Ollama's AI provider so that its impact is limited.

If we make changes to the entire utility, then we cannot be sure what the side effects are.

Since we are going to override eval_duration, it should be a custom property name so that it does not get confused with the actual value from Ollama if we ever want to use it in another place.
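
For illustration, an overridable duration on the monitor could look roughly like this (a sketch only; all names are assumptions, not the project's actual API):

```ts
// Hypothetical sketch: the monitor keeps its own wall-clock timing and only
// defers to a provider-supplied duration when one is explicitly passed in,
// so providers that report nothing are unaffected.
class StreamTimer {
  private readonly startedAt = Date.now();

  // Elapsed seconds, unless the provider reported its own generation time.
  elapsedSeconds(providerDurationSeconds?: number): number {
    return providerDurationSeconds ?? (Date.now() - this.startedAt) / 1000;
  }
}
```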


Labels

PR:needs review (Needs review by core team)


Development

Successfully merging this pull request may close the following issue:

  • [BUG]: Inaccurate duration metric from Ollama

3 participants