Add advanced code complexity and git history metrics #1

oscarvalenzuelab · 2025-12-09T06:35:30Z

Summary

This PR enhances the cost estimation model with advanced metrics beyond COCOMO and SLOC:

Halstead complexity metrics: Provides automated code complexity measurement (volume, difficulty, effort, estimated bugs)
Git history analysis: Captures repository maturity metrics (commits, contributors, age, releases, bus factor)
Maintainability Index: Calculates Microsoft's Maintainability Index (0-100 scale) combining Halstead volume, cyclomatic complexity, LOC, and comments
New cost multipliers:
- Maturity multiplier (1.0x - 2.5x) based on project age, contributors, and commit count
- Halstead multiplier (0.8x - 1.8x) based on code difficulty
Updated COCOMO II and SLOCCount estimators to incorporate all multipliers
Comprehensive test coverage with 50+ new tests (87 total, all passing)
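
The Halstead and Maintainability Index figures listed above can be illustrated with the textbook formulas. This is a minimal sketch, not the code from this PR: the function names are hypothetical, the bug estimate uses the common V/3000 approximation, and the Maintainability Index shown is the normalized three-term form without the comment term, since the PR description does not state how comments are weighted.

```python
import math


def halstead_metrics(n1, n2, N1, N2):
    """Textbook Halstead metrics (hypothetical helper, not the PR's analyzer).

    n1, n2: distinct operators / distinct operands
    N1, N2: total operators / total operands
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary) if vocabulary > 0 else 0.0
    difficulty = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0
    effort = difficulty * volume
    bugs = volume / 3000  # common approximation; assumed here
    return {
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
        "bugs": bugs,
    }


def maintainability_index(volume, cyclomatic, loc):
    """Normalized 0-100 Maintainability Index without a comment term."""
    # Inputs below 1 are raised to 1 so the logarithms stay defined (illustrative choice).
    raw = (
        171
        - 5.2 * math.log(max(volume, 1.0))
        - 0.23 * cyclomatic
        - 16.2 * math.log(max(loc, 1))
    )
    return max(0.0, min(100.0, raw * 100 / 171))
```

Higher volume, cyclomatic complexity, or line count all push the index toward 0, which is why it can feed a cost model as a maintainability signal.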

Problem Solved

The previous model significantly underestimated costs for mature, complex projects (e.g., ReactJS estimated at only $16K). These new metrics account for:

Project scale and organizational complexity
Code complexity beyond basic SLOC
Historical development patterns
Maintainability factors
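
To show how bounded multipliers can raise an estimate for a mature, complex project, here is a rough sketch. It makes several assumptions that the PR description does not confirm: it uses the Basic COCOMO organic-mode effort formula (2.4 × KSLOC^1.05 person-months) as a stand-in for the project's COCOMO II estimator, it combines the multipliers by simple multiplication, and it takes the raw multiplier values as given. Only the clamping ranges (1.0x to 2.5x and 0.8x to 1.8x) come from the Summary; all names are hypothetical.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def adjusted_effort(ksloc, maturity_raw, halstead_raw):
    """Person-months after applying both multipliers (illustrative only)."""
    # Basic COCOMO, organic mode; a stand-in, not the PR's COCOMO II model.
    base = 2.4 * ksloc ** 1.05
    maturity = clamp(maturity_raw, 1.0, 2.5)   # range from the PR summary
    halstead = clamp(halstead_raw, 0.8, 1.8)   # range from the PR summary
    # Multiplicative combination is an assumption.
    return base * maturity * halstead
```

Under these assumptions the two multipliers together can scale the base effort by anything from 0.8x to 4.5x.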

Test Plan

All 87 tests passing
Unit tests for Halstead analyzer (8 tests)
Unit tests for git history analyzer (9 tests)
Unit tests for maintainability calculator (10 tests)
Unit tests for new multipliers (13 tests)
Updated COCOMO estimator tests (7 tests)
End-to-end integration tests (4 tests)
Existing tests remain passing

Enhance cost estimation with additional metrics beyond COCOMO and SLOC:
- Add Halstead complexity metrics (volume, difficulty, effort, bugs)
- Add git history analysis (commits, contributors, age, releases, bus factor)
- Add Maintainability Index calculation (Microsoft's formula)
- Add maturity multiplier based on project age and scale
- Add Halstead-based complexity multiplier
- Update COCOMO estimator to incorporate all new multipliers
- Add comprehensive unit tests for all new analyzers
- Add end-to-end integration tests

These metrics provide more accurate cost estimates for mature projects by accounting for repository history, code complexity, and maintainability.

Extend Halstead complexity metrics to support 12 programming languages (Python plus the 11 listed below), using tree-sitter for universal AST parsing:
- Add tree-sitter and tree-sitter-languages dependencies
- Pin tree-sitter to 0.21.x for compatibility
- Implement language detection via file extensions
- Add operator/operand type mappings for each language
  - JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, PHP, Ruby, Swift
- Maintain graceful fallback to Python AST when tree-sitter is unavailable
- Add 7 new multi-language test cases
  - Language detection tests
  - JavaScript, TypeScript, Java, Go, Rust file analysis tests
  - Multi-language directory analysis tests
- Fix test for invalid syntax (tree-sitter is resilient to errors)
- Update CHANGELOG with multi-language feature details

All 94 tests passing with improved language coverage.
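
Language detection via file extensions, as mentioned in the commit above, could look roughly like the following. This is a sketch under assumptions: the mapping, the language identifiers, and the function name are hypothetical and are not taken from the PR's code.

```python
from pathlib import Path

# Hypothetical extension table covering the languages named in the commit message.
EXTENSION_TO_LANGUAGE = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".java": "java",
    ".c": "c",
    ".cpp": "cpp",
    ".cs": "c_sharp",
    ".go": "go",
    ".rs": "rust",
    ".php": "php",
    ".rb": "ruby",
    ".swift": "swift",
}


def detect_language(path):
    """Return a language name for the file's extension, or None if unknown."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix.lower())
```

Returning None for unknown extensions lets a caller skip files it cannot analyze instead of raising.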

The tree-sitter-languages package doesn't support Python 3.13 yet. Move tree-sitter dependencies to optional extras for graceful degradation:
- Create [multilang] optional dependency group for tree-sitter
- Restrict tree-sitter to Python <3.13 via environment markers
- Add Python 3.13 to classifiers (now officially supported)
- Update test workflow to test on Python 3.10-3.13
- Conditionally install multilang extras on Python <3.13 in CI
- Update CHANGELOG to document optional dependency

Without [multilang] installed:
- Python-only Halstead analysis works via built-in AST
- All other features work normally
- Graceful fallback already implemented via TREE_SITTER_AVAILABLE flag

With [multilang] installed (Python 3.10-3.12):
- Full multi-language Halstead support for 12 languages
  - Python plus JavaScript, TypeScript, Java, C, C++, C#, Go, Rust, PHP, Ruby, Swift

Installation: pip install ossval[multilang]
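
The graceful-degradation pattern described above is commonly written as a guarded import that sets a module-level flag. The sketch below assumes that shape: only the TREE_SITTER_AVAILABLE name comes from the commit message, while the helper function is hypothetical and stands in for the real Python-only analyzer.

```python
import ast

try:
    import tree_sitter  # noqa: F401  (optional dependency)
    TREE_SITTER_AVAILABLE = True
except ImportError:
    TREE_SITTER_AVAILABLE = False


def count_python_nodes(source):
    """Python-only fallback path using the built-in ast module.

    Returns the number of AST nodes, or None when the source does not parse.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return None
    return sum(1 for _ in ast.walk(tree))
```

Callers can branch on TREE_SITTER_AVAILABLE to choose the multi-language path and fall back to the ast-based path otherwise.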

Fix CI failures on Python 3.13 where tree-sitter is not available:
- Add pytest.mark.skipif decorators to multi-language tests
- Skip JavaScript, TypeScript, Java, Go, Rust tests when tree-sitter is unavailable
- Skip multi-language directory test when tree-sitter is unavailable
- Update invalid syntax test to handle both scenarios:
  - With tree-sitter: Returns metrics (resilient parsing)
  - Without tree-sitter: Returns None (AST fails on invalid syntax)
- Import TREE_SITTER_AVAILABLE flag for test conditionals

Test results:
- With tree-sitter: 15 tests pass (9 Python + 6 multi-language)
- Without tree-sitter: 9 tests pass, 6 tests skip (expected behavior)
- Full suite: 94 tests pass on all Python versions

This ensures CI passes on Python 3.13 without tree-sitter while maintaining full test coverage on Python 3.10-3.12 with [multilang] extras installed.

The workflow was manually installing pytest/ruff/black instead of using the [dev] extras group, which meant pytest-cov was missing. This caused pytest to fail since pyproject.toml has coverage options in addopts.
- Install [dev] extras instead of individual packages
- This ensures pytest-cov, mypy, and all other dev tools are installed
- Aligns PR validation with the main test workflow

Consolidate CI workflows to avoid redundant test execution:

Before:
- test.yml: 8 test jobs (4 Python versions × 2 OS) on PR to main
- pr-validation.yml: 1 test job (Python 3.13) on all PRs
- Total: 9 test runs for PRs to main (duplicated)

After:
- test.yml: Comprehensive testing (multiple versions/OS) on PR to main
- pr-validation.yml: Quick validation (linting, formatting, docs only)
- Total: 8 test runs (no duplication)

Changes:
- Rename job from 'test-and-lint' to 'lint'
- Remove pytest test execution from pr-validation.yml
- Remove [dev] extras installation (only need ruff and black)
- Keep black formatting and ruff linting checks
- Keep documentation file verification

Tests are still run comprehensively via test.yml which covers:
- Python 3.10, 3.11, 3.12, 3.13
- Ubuntu and macOS
- With and without tree-sitter (multilang extras)

oscarvalenzuelab added 7 commits December 8, 2025 22:32

Bump version to 1.2.1 and update CHANGELOG

dbce376

oscarvalenzuelab assigned oscarvalenzuelab and diegojorquera and unassigned oscarvalenzuelab Dec 9, 2025

oscarvalenzuelab added the bug, enhancement, and help wanted labels Dec 9, 2025

oscarvalenzuelab requested a review from diegojorquera December 9, 2025 07:14

diegojorquera approved these changes Dec 9, 2025

diegojorquera merged commit de62c67 into main Dec 9, 2025
14 checks passed

oscarvalenzuelab deleted the add-advanced-metrics branch December 9, 2025 11:58

3 participants
