@AbhimanyuAryan AbhimanyuAryan commented Oct 27, 2025

Feat: Rate limit for Gemini models

Valid types:

  • fix: - Bug fixes
  • feat: - New features
  • breaking: - Breaking changes
  • docs: - Documentation updates
  • refactor: - Code refactoring
  • test: - Test additions/modifications
  • chore: - Maintenance tasks
  • perf: - Performance improvements
  • style: - Code style changes
  • ci: - CI/CD configuration changes

Examples:

  • fix: resolve memory leak in data processing
  • feat: add export to CSV functionality
  • breaking: change API response format
  • docs: update installation guide

Description

Brief description of the changes in this PR

Type of change

  • Bug fix (fix:) - Non-breaking change which fixes an issue
  • New feature (feat:) - Non-breaking change which adds functionality
  • Breaking change (breaking:) - Fix or feature that would cause existing functionality to not work as expected
  • Documentation (docs:) - Documentation updates
  • Code refactoring (refactor:) - Code changes that neither fix a bug nor add a feature
  • Tests (test:) - Adding missing tests or correcting existing tests
  • Chore (chore:) - Maintenance tasks, dependency updates, etc.
  • Performance improvement (perf:) - Code changes that improve performance
  • Code style (style:) - Changes that do not affect the meaning of the code (formatting, missing semi-colons, etc.)
  • CI/CD (ci:) - Changes to CI/CD configuration files and scripts

Checklist

  • I have run pre-commit on my changed files and all checks pass
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Pre-commit status

# Paste the output of running pre-commit on your changed files:
# uv run pre-commit install
# git diff --name-only HEAD~1 | xargs uv run pre-commit run --files # for last commit
# git diff --name-only origin/<base branch>...HEAD | xargs uv run pre-commit run --files # for all commits in PR
# git add <your file> # if any fixes were applied
# git commit -m "chore: apply pre-commit fixes"
# git push origin <branch-name>

How to Test

Add test method for this PR.

Test CLI Command

Write down the bash command used for testing. If there are prerequisites, please call them out.

massgen --config massgen/configs/basic/multi/seven_gemini_agents.yaml --rate-limit "France has had a lot of capital in past. Which one is the most popular and why?"

Expected Results

[Screenshot: 2025-10-27 at 22:45:39]

Additional context

The goal is to implement rate-limiting logic that can be applied to any backend model, so that agents do not send an unbounded stream of requests and spam the API.

AbhimanyuAryan and others added 9 commits October 25, 2025 14:41
- Added configurable rate limiting system with RPM (requests/min), TPM (tokens/min), and RPD (requests/day) limits
- Created external YAML configuration for rate limits with conservative defaults for Gemini models
- Enhanced GlobalRateLimiter with new MultiRateLimiter class using sliding windows for precise limit tracking
- Integrated rate limiting into GeminiBackend with automatic token usage tracking
- Added comprehensive documentation
- Moved rate limits to centralized configuration file instead of hardcoding in orchestrator
- Added mandatory cooldown periods after agent startup to prevent API burst calls
- Implemented _load_rate_limits_from_config() to dynamically load and validate rate limits
- Added conservative rate limiting for low-RPM models (<=2 RPM)
- Added detailed logging for rate limit configuration and cooldown periods
- Added fallback defaults if config loading fails
- Removed three deprecated markdown files related to rate limiting documentation:
  - CHANGELOG_RATE_LIMITING.md
  - QUICK_START_RATE_LIMITING.md
  - RATE_LIMITING_COMPLETE.md
- These files contained old implementation details that have since been superseded by newer documentation in the main docs directory
- Added separate rate limits for Gemini Flash (9 RPM) and Pro (2 RPM) models
- Removed shared 7 RPM limit to prevent rate limit errors
- Updated orchestrator startup timing to handle model-specific limits
- Added documentation explaining rate limit changes and performance implications
- Included recommendations for optimizing agent configurations based on rate limits
- Added test instructions and example logs for verification
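
As a rough illustration of the sliding-window tracking mentioned in these commits, here is a minimal Python sketch that enforces RPM, TPM, and RPD limits together. It is not the PR's `MultiRateLimiter`: the class names `SlidingWindowLimit` and `MultiLimiterSketch`, their methods, and the injectable `clock` argument are all assumptions made for this example.

```python
import time
from collections import deque


class SlidingWindowLimit:
    """Hypothetical helper: tracks weighted events inside a rolling window."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, weight) pairs, oldest first
        self.total = 0

    def _evict(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, weight = self.events.popleft()
            self.total -= weight

    def wait_time(self, now, weight=1):
        """Seconds until `weight` more units fit; 0.0 if they fit now."""
        self._evict(now)
        needed = self.total + weight - self.limit
        if needed <= 0:
            return 0.0
        # Walk from the oldest event until enough weight would expire.
        freed = 0
        for timestamp, event_weight in self.events:
            freed += event_weight
            if freed >= needed:
                return timestamp + self.window - now
        return float("inf")  # the request alone exceeds the limit

    def record(self, now, weight=1):
        self.events.append((now, weight))
        self.total += weight


class MultiLimiterSketch:
    """Hypothetical combination of RPM, TPM, and RPD windows.

    A limit left as None (or 0) is treated as "not enforced".
    """

    def __init__(self, rpm=None, tpm=None, rpd=None, clock=time.monotonic):
        self.clock = clock
        self.rpm = SlidingWindowLimit(rpm, 60) if rpm else None
        self.tpm = SlidingWindowLimit(tpm, 60) if tpm else None
        self.rpd = SlidingWindowLimit(rpd, 86400) if rpd else None

    def wait_time(self, tokens=0):
        """Longest wait demanded by any configured limit."""
        now = self.clock()
        waits = [0.0]
        for window in (self.rpm, self.rpd):
            if window is not None:
                waits.append(window.wait_time(now, 1))
        if self.tpm is not None and tokens:
            waits.append(self.tpm.wait_time(now, tokens))
        return max(waits)

    def record(self, tokens=0):
        """Register one request and the tokens it consumed."""
        now = self.clock()
        for window in (self.rpm, self.rpd):
            if window is not None:
                window.record(now, 1)
        if self.tpm is not None and tokens:
            self.tpm.record(now, tokens)
```

A caller would sleep for `wait_time(...)` seconds before each request and call `record(...)` afterwards with the token count reported by the backend. Keeping the clock injectable makes the limiter testable without real delays.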
@AbhimanyuAryan AbhimanyuAryan marked this pull request as draft October 27, 2025 23:52
- Added enable_rate_limit flag to control agent startup rate limiting behavior
- Modified Orchestrator to skip rate limit checks when feature is disabled
- Updated CLI to pass rate limit configuration through to Orchestrator
- Added documentation for new enable_rate_limit parameter
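
To make the opt-in startup cooldown concrete, the following Python sketch computes how long each agent's start might be delayed when agents sharing a model are spaced by that model's RPM limit. The function `startup_delays`, its parameters, the `default_rpm` fallback, and the model keys `"flash"` and `"pro"` are hypothetical; only the `enable_rate_limit` flag name and the 9 RPM and 2 RPM figures come from the commit messages, and the Orchestrator's actual logic may differ.

```python
def startup_delays(agent_models, rpm_limits, enable_rate_limit=False, default_rpm=2):
    """Return the start offset in seconds for each agent, in input order.

    Assumption: agents using the same model are spaced 60 / RPM seconds
    apart, and agents using different models do not delay each other.
    """
    if not enable_rate_limit:
        # Feature disabled: every agent starts immediately.
        return [0.0] * len(agent_models)

    next_slot = {}  # model name -> earliest start offset for its next agent
    delays = []
    for model in agent_models:
        rpm = rpm_limits.get(model, default_rpm)  # fallback if model is unknown
        start = next_slot.get(model, 0.0)
        delays.append(start)
        next_slot[model] = start + 60.0 / rpm
    return delays


# Example with the per-model limits quoted in the commits above.
limits = {"flash": 9, "pro": 2}
offsets = startup_delays(["pro", "pro", "flash"], limits, enable_rate_limit=True)
```

With these inputs the second `"pro"` agent is held back by 30 seconds, while the `"flash"` agent starts at once because it has its own budget.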

The changes introduce an opt-in rate limiting system for agent startup, allowing users to control whether cooldown delays are enforced between agent initializations. This helps
@ncrispino ncrispino added this to the v0.1.7 milestone Oct 28, 2025