@johnbean393
Description

This PR adds support for the new grok-4-fast model via OpenRouter through the bytebot-llm-proxy (LiteLLM).

Changes

  • Removed max_tokens parameter from proxy service Chat Completion requests
  • Removed reasoning_effort parameter from proxy service Chat Completion requests

These model-specific parameters caused compatibility issues with the grok-4-fast model. With them removed, the proxy service works with grok-4-fast and with other models that don't support these parameters, and LiteLLM handles the model-specific parameter mapping.
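As a rough illustration of the change, the sketch below strips the two parameters before a request body is handed to the proxy. The `CompletionOptions` shape and the `buildProxyRequest` helper are hypothetical names used for illustration only; the actual proxy service code may be organized differently.

```typescript
// Hypothetical message and options shapes; not taken from the bytebot source.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface CompletionOptions {
  model: string;
  messages: ChatMessage[];
  max_tokens?: number;
  reasoning_effort?: "low" | "medium" | "high";
  [key: string]: unknown;
}

// The two parameters this PR stops sending to the proxy.
const OMITTED_PARAMS = ["max_tokens", "reasoning_effort"] as const;

// Returns a copy of the options without the omitted parameters,
// leaving every other field (model, messages, temperature, ...) untouched.
function buildProxyRequest(options: CompletionOptions): Record<string, unknown> {
  const body: Record<string, unknown> = { ...options };
  for (const key of OMITTED_PARAMS) {
    delete body[key];
  }
  return body;
}
```

Copying the options before deleting keys keeps the caller's object unchanged, so the same options can still be reused elsewhere.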

Testing

  • Verified that grok-4-fast model works correctly through the LiteLLM proxy
  • Confirmed backward compatibility with existing models

Remove max_tokens and reasoning_effort parameters from proxy service
to improve compatibility with grok-4-fast model through OpenRouter.
These model-specific parameters were causing issues with the new model.
- Add proxy.model-info.ts to dynamically fetch context windows from OpenRouter API
- Update tasks.controller.ts to use async extractContextWindow function
- Replace hardcoded 128K context window with dynamic values from OpenRouter
- Implement caching layer (1-hour TTL) to minimize API calls
- Fix Dockerfile to properly handle Prisma in Alpine Linux
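The caching layer described above could look roughly like the following. This is a minimal sketch, not the contents of proxy.model-info.ts: the `ContextWindowCache` class, the injected `load` function (which would wrap the OpenRouter model listing request), and the injectable clock are all assumptions made so the example stays self-contained.

```typescript
// Hypothetical loader: resolves to a map of model id -> context window size.
// In a real service this would call the OpenRouter API.
type ContextWindowLoader = () => Promise<Map<string, number>>;

const ONE_HOUR_MS = 60 * 60 * 1000;

class ContextWindowCache {
  private entries: Map<string, number> | null = null;
  private fetchedAt = 0;
  private readonly load: ContextWindowLoader;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    load: ContextWindowLoader,
    ttlMs: number = ONE_HOUR_MS,
    now: () => number = Date.now,
  ) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached context window for a model, reloading the whole
  // table when nothing is cached yet or the TTL has elapsed.
  async get(modelId: string): Promise<number | undefined> {
    const expired = this.now() - this.fetchedAt >= this.ttlMs;
    if (this.entries === null || expired) {
      this.entries = await this.load();
      this.fetchedAt = this.now();
    }
    return this.entries.get(modelId);
  }
}
```

Caching the whole model table rather than individual lookups means one upstream request per TTL window regardless of how many models are queried. Concurrent callers during a refresh may each trigger a load in this sketch; a shared in-flight promise would avoid that.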

Benefits:
- Grok 4 Fast now correctly reports 2M token context window
- Claude Sonnet 4.5 reports 1M tokens instead of 200K
- Gemini 2.5 models report 1,048,576 tokens
- All models automatically get accurate, up-to-date context windows
- Improves agent performance by preventing premature summarization

Fixes context window inaccuracies by prioritizing:
1. LiteLLM model_info (when available)
2. OpenRouter API context_length (when LiteLLM returns null)
3. Default fallback (128K)
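The priority order above can be sketched as a small resolver. The field names `liteLlmMaxInputTokens` and `openRouterContextLength` are hypothetical, and the exact default of 128000 is an assumption for "128K"; the real extractContextWindow function may differ.

```typescript
// Assumed numeric value for the "128K" default fallback.
const DEFAULT_CONTEXT_WINDOW = 128000;

// Hypothetical container for the two upstream values.
interface ContextWindowSources {
  liteLlmMaxInputTokens?: number | null;
  openRouterContextLength?: number | null;
}

// Picks the first usable value in priority order:
// LiteLLM model_info, then OpenRouter context_length, then the default.
function resolveContextWindow(sources: ContextWindowSources): number {
  const candidates = [
    sources.liteLlmMaxInputTokens,
    sources.openRouterContextLength,
  ];
  for (const value of candidates) {
    if (typeof value === "number" && Number.isFinite(value) && value > 0) {
      return value;
    }
  }
  return DEFAULT_CONTEXT_WINDOW;
}
```

Treating null, undefined, zero, and non-finite values alike means a missing or malformed upstream value always falls through to the next source instead of producing an unusable context window.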

Related to Grok 4 Fast support