Rate limit gemini based on different metric #383

AbhimanyuAryan · 2025-10-27T22:41:51Z

Feat: Rate limit for Gemini models

Valid types:

fix: - Bug fixes
feat: - New features
breaking: - Breaking changes
docs: - Documentation updates
refactor: - Code refactoring
test: - Test additions/modifications
chore: - Maintenance tasks
perf: - Performance improvements
style: - Code style changes
ci: - CI/CD configuration changes

Examples:

fix: resolve memory leak in data processing
feat: add export to CSV functionality
breaking: change API response format
docs: update installation guide

Description

Brief description of the changes in this PR

Type of change

Checklist

I have run pre-commit on my changed files and all checks pass
My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add test method for this PR.

Test CLI Command

Write down the bash command used for testing. If there are prerequisites, please call them out.

massgen --config massgen/configs/basic/multi/seven_gemini_agents.yaml --rate-limit  "France has had a lot of capital in past. Which one is the most popular and why?"

Expected Results

Additional context

The goal is to implement rate limiter logic for any backend model, so it doesn’t make endless requests and spam the API.

- Added configurable rate limiting system with RPM (requests/min), TPM (tokens/min), and RPD (requests/day) limits
- Created external YAML configuration for rate limits with conservative defaults for Gemini models
- Enhanced GlobalRateLimiter with new MultiRateLimiter class using sliding windows for precise limit tracking
- Integrated rate limiting into GeminiBackend with automatic token usage tracking
- Added comprehensive documentation
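As a rough illustration of the sliding-window idea described above, here is a minimal sketch. It is not the PR's `MultiRateLimiter`; the class names `SlidingWindow` and `SketchMultiLimiter`, the `try_acquire` method, and the choice to pass a token estimate up front are all assumptions made for this example.

```python
import time
from collections import deque


class SlidingWindow:
    """Rolling window that sums event weights (hypothetical helper)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, weight), oldest first
        self.total = 0

    def has_room(self, now, weight):
        # Drop events that have aged out of the window before checking.
        while self.events and now - self.events[0][0] >= self.window_seconds:
            self.total -= self.events.popleft()[1]
        return self.total + weight <= self.limit

    def record(self, now, weight):
        self.events.append((now, weight))
        self.total += weight


class SketchMultiLimiter:
    """Admits a request only if every configured window has room."""

    def __init__(self, rpm=None, tpm=None, rpd=None):
        # Each entry pairs a window with a function giving the event weight.
        self.windows = []
        if rpm is not None:
            self.windows.append((SlidingWindow(rpm, 60), lambda tokens: 1))
        if tpm is not None:
            self.windows.append((SlidingWindow(tpm, 60), lambda tokens: tokens))
        if rpd is not None:
            self.windows.append((SlidingWindow(rpd, 86_400), lambda tokens: 1))

    def try_acquire(self, tokens=0, now=None):
        # `now` is injectable so the behaviour can be checked without sleeping.
        now = time.monotonic() if now is None else now
        if not all(w.has_room(now, weigh(tokens)) for w, weigh in self.windows):
            return False
        for w, weigh in self.windows:
            w.record(now, weigh(tokens))
        return True
```

A caller would loop on `try_acquire(...)` and sleep briefly while it returns `False`. Because events are evicted by age rather than at fixed minute boundaries, a burst at the end of one minute still counts against the start of the next.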

- Moved rate limits to centralized configuration file instead of hardcoding in orchestrator
- Added mandatory cooldown periods after agent startup to prevent API burst calls
- Implemented _load_rate_limits_from_config() to dynamically load and validate rate limits
- Added conservative rate limiting for low-RPM models (<=2 RPM)
- Added detailed logging for rate limit configuration and cooldown periods
- Added fallback defaults if config loading fails
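The load, validate, and fall-back flow described above might look roughly like the sketch below. It operates on an already-parsed mapping rather than reading YAML, and it is not the PR's `_load_rate_limits_from_config()`: the function names, the `rpm`/`tpm`/`rpd` keys, and the fallback values are assumptions chosen for illustration.

```python
# Illustrative fallback only; the PR's actual defaults are not shown in this page.
FALLBACK_LIMITS = {"rpm": 2, "tpm": None, "rpd": None}


def validate_limits(raw):
    """Return a normalized limits dict, or raise ValueError on bad values."""
    limits = {}
    for key in ("rpm", "tpm", "rpd"):
        value = raw.get(key)
        if value is None:
            limits[key] = None  # limit not configured, so not enforced
            continue
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer, got {value!r}")
        limits[key] = value
    return limits


def load_limits_for_model(config, model, fallback=FALLBACK_LIMITS):
    """Look up and validate limits for `model`, using the fallback on any failure."""
    try:
        return validate_limits(config[model])
    except (KeyError, TypeError, AttributeError, ValueError):
        # Missing model, unparsed config (None), or malformed entry.
        return dict(fallback)
```

Falling back to a conservative limit instead of raising keeps a broken config file from turning into an unthrottled run, at the cost of hiding the misconfiguration unless it is also logged.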

- Removed three deprecated markdown files related to rate limiting documentation:
  - CHANGELOG_RATE_LIMITING.md
  - QUICK_START_RATE_LIMITING.md
  - RATE_LIMITING_COMPLETE.md
- These files contained old implementation details that have since been superseded by newer documentation in the main docs directory

- Added separate rate limits for Gemini Flash (9 RPM) and Pro (2 RPM) models
- Removed shared 7 RPM limit to prevent rate limit errors
- Updated orchestrator startup timing to handle model-specific limits
- Added documentation explaining rate limit changes and performance implications
- Included recommendations for optimizing agent configurations based on rate limits
- Added test instructions and example logs for verification
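To make the performance implication of the per-model limits concrete, the sketch below works out startup spacing from the 9 RPM and 2 RPM figures above. Matching on the substrings `flash` and `pro`, the helper names, and the even-spacing policy are assumptions for this example, not the orchestrator's actual logic.

```python
# RPM figures come from the PR description; the lookup scheme is hypothetical.
MODEL_RPM = {"flash": 9, "pro": 2}


def rpm_for_model(model_name):
    """Return the RPM limit for a model name, or None if no family matches."""
    name = model_name.lower()
    for family, rpm in MODEL_RPM.items():
        if family in name:
            return rpm
    return None


def min_spacing_seconds(rpm):
    """Smallest even gap between requests that stays within `rpm`."""
    return 60.0 / rpm


def startup_offsets(agent_count, rpm):
    """Seconds after the first start at which each agent may start."""
    spacing = min_spacing_seconds(rpm)
    return [index * spacing for index in range(agent_count)]
```

Under these assumptions, seven agents on a 2 RPM model are spaced 30 seconds apart, so the last one starts 180 seconds after the first.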

- Added enable_rate_limit flag to control agent startup rate limiting behavior
- Modified Orchestrator to skip rate limit checks when feature is disabled
- Updated CLI to pass rate limit configuration through to Orchestrator
- Added documentation for new enable_rate_limit parameter

The changes introduce an opt-in rate limiting system for agent startup, allowing users to control whether cooldown delays are enforced between agent initializations. This helps
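The opt-in wiring from CLI flag to orchestrator could be sketched as follows. The `--rate-limit` flag and the `enable_rate_limit` name appear in this PR; the use of `argparse`, the `SketchOrchestrator` class, its `start_agents` method, and the fixed `cooldown_seconds` value are assumptions for illustration only.

```python
import argparse
import time


def build_parser():
    """Hypothetical parser exposing an opt-in --rate-limit switch."""
    parser = argparse.ArgumentParser(prog="massgen-sketch")
    parser.add_argument("--rate-limit", action="store_true",
                        help="enforce cooldown delays between agent startups")
    parser.add_argument("question", nargs="?")
    return parser


class SketchOrchestrator:
    """Starts agents in order, pausing between them only when enabled."""

    def __init__(self, enable_rate_limit=False, cooldown_seconds=30.0,
                 sleep=time.sleep):
        self.enable_rate_limit = enable_rate_limit
        self.cooldown_seconds = cooldown_seconds
        self._sleep = sleep  # injectable so tests do not actually wait

    def start_agents(self, agent_ids):
        started = []
        for index, agent_id in enumerate(agent_ids):
            # No cooldown before the first agent, and none when disabled.
            if self.enable_rate_limit and index > 0:
                self._sleep(self.cooldown_seconds)
            started.append(agent_id)
        return started
```

With `action="store_true"` the flag defaults to `False`, so existing invocations keep their current startup behaviour unless `--rate-limit` is passed.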

AbhimanyuAryan and others added 9 commits October 25, 2025 14:41

Adding rate limit to gemini models for 1 minute

e1e7d68

Moving rate limit to toggle cli flag

0b4b79a

gemini bug passing rate limit to config

a8d9d12

Merge remote-tracking branch 'origin/main' into rate-limit-gemini

0de2c98

Remove unrelated AGENT_STARTUP_RATE_LIMITING.md documentation

de10e10

AbhimanyuAryan marked this pull request as draft October 27, 2025 23:52

ncrispino added this to the v0.1.7 milestone Oct 28, 2025


2 participants
