@oscarvalenzuelab
Contributor

Summary

This PR enhances the cost estimation model with advanced metrics beyond COCOMO and SLOC:

  • Halstead complexity metrics: Provides automated code complexity measurement (volume, difficulty, effort, estimated bugs)
  • Git history analysis: Captures repository maturity metrics (commits, contributors, age, releases, bus factor)
  • Maintainability Index: Calculates Microsoft's Maintainability Index (0-100 scale) combining Halstead volume, cyclomatic complexity, LOC, and comments
  • New cost multipliers:
    • Maturity multiplier (1.0x - 2.5x) based on project age, contributors, and commit count
    • Halstead multiplier (0.8x - 1.8x) based on code difficulty
  • Updated COCOMO II and SLOCCount estimators to incorporate all multipliers
  • Comprehensive test coverage with 50+ new tests (87 total, all passing)
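
For reference, the four Halstead figures named above follow from four token counts. The sketch below uses the textbook definitions. The names `HalsteadMetrics` and `halstead` are hypothetical, and the `volume / 3000` bug estimate is one common variant, so the analyzer in this PR may differ in detail.

```python
import math
from dataclasses import dataclass


@dataclass
class HalsteadMetrics:
    volume: float
    difficulty: float
    effort: float
    bugs: float


def halstead(n1: int, n2: int, N1: int, N2: int) -> HalsteadMetrics:
    """Textbook Halstead metrics (hypothetical helper, not the PR's API).

    n1, n2: distinct operators and operands.
    N1, N2: total operator and operand occurrences.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    if vocabulary == 0 or n2 == 0:
        # Empty input: avoid log2(0) and division by zero.
        return HalsteadMetrics(0.0, 0.0, 0.0, 0.0)
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    bugs = volume / 3000  # assumed variant of the delivered-bugs estimate
    return HalsteadMetrics(volume, difficulty, effort, bugs)
```

For example, 2 distinct operators and 2 distinct operands with 4 occurrences of each give a volume of 16 and a difficulty of 2.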

Problem Solved

The previous model significantly underestimated costs for mature, complex projects (e.g., ReactJS estimated at only $16K). These new metrics account for:

  • Project scale and organizational complexity
  • Code complexity beyond basic SLOC
  • Historical development patterns
  • Maintainability factors

Test Plan

  • All 87 tests passing
  • Unit tests for Halstead analyzer (8 tests)
  • Unit tests for git history analyzer (9 tests)
  • Unit tests for maintainability calculator (10 tests)
  • Unit tests for new multipliers (13 tests)
  • Updated COCOMO estimator tests (7 tests)
  • End-to-end integration tests (4 tests)
  • Existing tests remain passing

Enhance cost estimation with additional metrics beyond COCOMO and SLOC:

- Add Halstead complexity metrics (volume, difficulty, effort, bugs)
- Add git history analysis (commits, contributors, age, releases, bus factor)
- Add Maintainability Index calculation (Microsoft's formula)
- Add maturity multiplier based on project age and scale
- Add Halstead-based complexity multiplier
- Update COCOMO estimator to incorporate all new multipliers
- Add comprehensive unit tests for all new analyzers
- Add end-to-end integration tests

These metrics provide more accurate cost estimates for mature projects
by accounting for repository history, code complexity, and maintainability.
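
As an illustration of the Maintainability Index step, here is a minimal sketch of the normalized 0-100 formula as documented for Visual Studio. The function name is hypothetical. This commit also mentions comments as an input, so the calculator added here presumably includes a comment term that this sketch omits.

```python
import math


def maintainability_index(
    halstead_volume: float, cyclomatic_complexity: float, loc: int
) -> float:
    """Normalized Maintainability Index on a 0-100 scale (sketch only)."""
    if halstead_volume <= 0 or loc <= 0:
        # Assumption: treat empty input as perfectly maintainable.
        return 100.0
    raw = (
        171
        - 5.2 * math.log(halstead_volume)
        - 0.23 * cyclomatic_complexity
        - 16.2 * math.log(loc)
    )
    # Rescale from the original 0-171 range and clamp to 0-100.
    return max(0.0, min(100.0, raw * 100 / 171))
```

Higher values indicate code that is easier to maintain. Large volume, high cyclomatic complexity, and long files all pull the score down.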

Extend Halstead complexity metrics from Python only to 12 programming languages
(Python plus the 11 listed below), using tree-sitter for universal AST parsing:

- Add tree-sitter and tree-sitter-languages dependencies
- Pin tree-sitter to 0.21.x for compatibility
- Implement language detection via file extensions
- Add operator/operand type mappings for each language
  - JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, PHP, Ruby, Swift
- Maintain graceful fallback to Python AST when tree-sitter unavailable
- Add 7 new multi-language test cases
  - Language detection tests
  - JavaScript, TypeScript, Java, Go, Rust file analysis tests
  - Multi-language directory analysis tests
- Fix test for invalid syntax (tree-sitter is resilient to errors)
- Update CHANGELOG with multi-language feature details

All 94 tests passing with improved language coverage.
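
Language detection via file extensions can be sketched as a simple lookup. The table and the function name `detect_language` are hypothetical. The extensions and language identifiers actually used in this commit are not shown here and may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table covering the 12 languages named above.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".java": "java",
    ".c": "c",
    ".cpp": "cpp",
    ".cs": "c_sharp",
    ".go": "go",
    ".rs": "rust",
    ".php": "php",
    ".rb": "ruby",
    ".swift": "swift",
}


def detect_language(path: str) -> Optional[str]:
    """Return a language key for the file, or None if the extension is unknown."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
```

Files with unrecognized extensions return `None`, so a caller can skip them instead of failing.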

The tree-sitter-languages package doesn't support Python 3.13 yet.
Move tree-sitter dependencies to optional extras for graceful degradation:

- Create [multilang] optional dependency group for tree-sitter
- Restrict tree-sitter to Python <3.13 via environment markers
- Add Python 3.13 to classifiers (now officially supported)
- Update test workflow to test on Python 3.10-3.13
- Conditionally install multilang extras on Python <3.13 in CI
- Update CHANGELOG to document optional dependency

Without [multilang] installed:
- Python-only Halstead analysis works via built-in AST
- All other features work normally
- Graceful fallback already implemented via TREE_SITTER_AVAILABLE flag

With [multilang] installed (Python 3.10-3.12):
- Full multi-language Halstead support for 12 languages
- Python plus JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, PHP, Ruby, Swift

Installation: pip install ossval[multilang]
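
The graceful degradation described above can be sketched as follows. Only the `TREE_SITTER_AVAILABLE` name comes from this PR. The helper functions and the use of `importlib.util.find_spec` to set the flag are assumptions for illustration.

```python
import ast
import importlib.util
from typing import Optional

# True only when the optional tree-sitter bindings are importable.
TREE_SITTER_AVAILABLE = importlib.util.find_spec("tree_sitter") is not None


def select_backend(language: str) -> Optional[str]:
    """Pick a parser backend for a language (hypothetical helper)."""
    if TREE_SITTER_AVAILABLE:
        return "tree-sitter"
    if language == "python":
        return "ast"  # built-in fallback, Python only
    return None  # no parser available for this language


def parse_python(source: str) -> Optional[ast.AST]:
    """Fallback parse with the built-in ast module; None on invalid syntax."""
    try:
        return ast.parse(source)
    except SyntaxError:
        return None
```

Without the extras, Python files still get analyzed and every other language is skipped.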

Fix CI failures on Python 3.13 where tree-sitter is not available:

- Add pytest.mark.skipif decorators to multi-language tests
- Skip JavaScript, TypeScript, Java, Go, Rust tests when tree-sitter unavailable
- Skip multi-language directory test when tree-sitter unavailable
- Update invalid syntax test to handle both scenarios:
  - With tree-sitter: Returns metrics (resilient parsing)
  - Without tree-sitter: Returns None (AST fails on invalid syntax)
- Import TREE_SITTER_AVAILABLE flag for test conditionals

Test results:
- With tree-sitter: 15 tests pass (9 Python + 6 multi-language)
- Without tree-sitter: 9 tests pass, 6 tests skip (expected behavior)
- Full suite: 94 tests pass on all Python versions

This ensures CI passes on Python 3.13 without tree-sitter while maintaining
full test coverage on Python 3.10-3.12 with [multilang] extras installed.
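
The conditional-skip pattern can be sketched with the standard library alone. This commit uses pytest's skipif marker. The example below swaps in `unittest.skipUnless` so it runs without extra packages, and the test names and bodies are placeholders.

```python
import importlib.util
import unittest

# Assumed way to compute the flag; the PR imports it from the analyzer module.
TREE_SITTER_AVAILABLE = importlib.util.find_spec("tree_sitter") is not None


class MultiLanguageTests(unittest.TestCase):
    @unittest.skipUnless(TREE_SITTER_AVAILABLE, "tree-sitter not installed")
    def test_javascript_analysis(self):
        # Placeholder: a real test would analyze a JavaScript file here.
        self.assertTrue(TREE_SITTER_AVAILABLE)

    def test_python_analysis(self):
        # The Python-only path needs no optional dependency, so it always runs.
        self.assertEqual(1 + 1, 2)


def run_suite() -> unittest.TestResult:
    """Run the example tests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MultiLanguageTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

On an interpreter without tree-sitter, the JavaScript test is reported as skipped and the run still succeeds.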

The workflow was manually installing pytest/ruff/black instead of using
the [dev] extras group, which meant pytest-cov was missing. This caused
pytest to fail since pyproject.toml has coverage options in addopts.

- Install [dev] extras instead of individual packages
- This ensures pytest-cov, mypy, and all other dev tools are installed
- Aligns PR validation with the main test workflow

Consolidate CI workflows to avoid redundant test execution:

Before:
- test.yml: 8 test jobs (4 Python versions × 2 OS) on PR to main
- pr-validation.yml: 1 test job (Python 3.13) on all PRs
- Total: 9 test runs for PRs to main (duplicated)

After:
- test.yml: Comprehensive testing (multiple versions/OS) on PR to main
- pr-validation.yml: Quick validation (linting, formatting, docs only)
- Total: 8 test runs (no duplication)

Changes:
- Rename job from 'test-and-lint' to 'lint'
- Remove pytest test execution from pr-validation.yml
- Remove [dev] extras installation (only need ruff and black)
- Keep black formatting and ruff linting checks
- Keep documentation file verification

Tests are still run comprehensively via test.yml which covers:
- Python 3.10, 3.11, 3.12, 3.13
- Ubuntu and macOS
- With and without tree-sitter (multilang extras)
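
The resulting matrix can be enumerated to check the job count. This is an illustrative sketch, not the workflow definition. The dictionary keys are hypothetical, and the `multilang` flag mirrors the Python <3.13 restriction described in the earlier commits.

```python
from itertools import product

PYTHON_VERSIONS = ["3.10", "3.11", "3.12", "3.13"]
OPERATING_SYSTEMS = ["ubuntu", "macos"]


def version_tuple(version: str) -> tuple:
    """Turn '3.10' into (3, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def build_matrix() -> list:
    """List one job per OS and Python version combination."""
    return [
        {
            "os": os_name,
            "python": py,
            # multilang extras are installed only on Python < 3.13
            "multilang": version_tuple(py) < (3, 13),
        }
        for os_name, py in product(OPERATING_SYSTEMS, PYTHON_VERSIONS)
    ]
```

This yields 8 jobs, 6 of which install the multilang extras and 2 of which (Python 3.13) exercise the fallback path.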

@oscarvalenzuelab added the bug, enhancement, and help wanted labels on Dec 9, 2025
@diegojorquera merged commit de62c67 into main on Dec 9, 2025
14 checks passed
@oscarvalenzuelab deleted the add-advanced-metrics branch on December 9, 2025 11:58
3 participants