Merged
136 changes: 136 additions & 0 deletions .github/workflows/cost-center-sync-cached.yml
@@ -0,0 +1,136 @@
name: Cost Center Sync with Caching

on:
  schedule:
    # Run every 6 hours (can be adjusted based on needs)
    - cron: '0 */6 * * *'
  workflow_dispatch:
    inputs:
      mode:
        description: 'Execution mode'
        required: true
        default: 'plan'
        type: choice
        options:
          - plan
          - apply
      clear_cache:
        description: 'Clear cost center cache before running'
        required: false
        default: false
        type: boolean

jobs:
  sync-cost-centers:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Restore cost center cache
        uses: actions/cache@v4
        with:
          path: .cache/
          key: cost-center-cache-${{ github.repository }}-${{ hashFiles('**/config.yaml') }}
          restore-keys: |
            cost-center-cache-${{ github.repository }}-
            cost-center-cache-

      - name: Show cache statistics (before)
        run: |
          python main.py --cache-stats

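      - name: Clear cache if requested
        # `inputs.clear_cache` is a boolean in the `inputs` context, so test it
        # directly; comparing it to the string 'true' does not match.
        if: ${{ inputs.clear_cache }}
        run: |
          echo "Clearing cost center cache as requested..."
          python main.py --clear-cache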

      - name: Clean up expired cache entries
        run: |
          python main.py --cache-cleanup

      - name: Run cost center sync (teams mode)
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_ENTERPRISE: ${{ vars.GITHUB_ENTERPRISE }}
        run: |
          MODE="${{ inputs.mode || 'plan' }}"
          echo "Running cost center sync in $MODE mode..."

          if [ "$MODE" = "apply" ]; then
            python main.py --teams-mode --assign-cost-centers --mode apply --yes
          else
            python main.py --teams-mode --assign-cost-centers --mode plan
          fi

      - name: Show cache statistics (after)
        if: always()
        run: |
          python main.py --cache-stats

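      - name: Upload cache statistics as artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: cache-stats-${{ github.run_number }}
          path: |
            .cache/
          # Recent upload-artifact v4 releases skip hidden paths such as
          # .cache/ by default, so opt in explicitly.
          include-hidden-files: true
          retention-days: 7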

  performance-report:
Comment on lines +25 to +92

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 13 days ago

To fix the problem, you should add a permissions block to restrict the permissions of the GITHUB_TOKEN. The most secure and simplest approach is to add the following block at the top level of the workflow (before jobs:), which will restrict all jobs in this workflow to only have read access to repository contents:

permissions:
  contents: read

This setting will ensure the jobs do not have unnecessary write access. If a job (now or in the future) requires additional permissions (e.g., to open issues or PRs, or write to contents), you must explicitly grant only those permissions to the specific jobs needing it. At present, based on the steps in both jobs, only reading contents is necessary (for checkout and artifact actions), so contents: read suffices.

The change should go in .github/workflows/cost-center-sync-cached.yml, inserted between the workflow name: and the on: block (i.e., between lines 1 and 3).

Suggested changeset 1
.github/workflows/cost-center-sync-cached.yml

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/cost-center-sync-cached.yml b/.github/workflows/cost-center-sync-cached.yml
--- a/.github/workflows/cost-center-sync-cached.yml
+++ b/.github/workflows/cost-center-sync-cached.yml
@@ -1,5 +1,7 @@
 name: Cost Center Sync with Caching
 
+permissions:
+  contents: read
 on:
   schedule:
     # Run every 6 hours (can be adjusted based on needs)
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
    runs-on: ubuntu-latest
    needs: sync-cost-centers
    if: always()

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Download cache artifacts
        uses: actions/download-artifact@v4
        with:
          name: cache-stats-${{ github.run_number }}
          path: .cache/

      - name: Generate performance report
        run: |
          echo "# Cost Center Sync Performance Report" > performance-report.md
          echo "" >> performance-report.md
          echo "**Run:** ${{ github.run_number }}" >> performance-report.md
          echo "**Date:** $(date -u '+%Y-%m-%d %H:%M:%S UTC')" >> performance-report.md
          echo "**Mode:** ${{ inputs.mode || 'plan' }}" >> performance-report.md
          echo "" >> performance-report.md
          echo "## Cache Performance" >> performance-report.md
          echo "" >> performance-report.md
          echo '```' >> performance-report.md
          python main.py --cache-stats >> performance-report.md 2>&1 || true
          echo '```' >> performance-report.md

      - name: Upload performance report
        uses: actions/upload-artifact@v4
        with:
          name: performance-report-${{ github.run_number }}
          path: performance-report.md
          retention-days: 30
Comment on lines +93 to +136

Check warning

Code scanning / CodeQL

Workflow does not contain permissions Medium

Actions job or workflow does not limit the permissions of the GITHUB_TOKEN. Consider setting an explicit permissions block, using the following as a minimal starting point: {contents: read}

Copilot Autofix

AI 13 days ago

To fix this problem, an explicit permissions block should be added to the workflow, ideally at the top/root level so that it applies to all jobs by default. According to the principle of least privilege, the default can usually be set to contents: read, which covers common operations like artifact upload/download and source checkout. If any step requires additional permissions (e.g., opening PRs, commenting on issues), those can be set at the job or step level. For the provided workflow, no operation appears to require more than contents: read, so this setting suffices. The block should be added after the name: field and before on:.


Suggested changeset 1
.github/workflows/cost-center-sync-cached.yml

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/.github/workflows/cost-center-sync-cached.yml b/.github/workflows/cost-center-sync-cached.yml
--- a/.github/workflows/cost-center-sync-cached.yml
+++ b/.github/workflows/cost-center-sync-cached.yml
@@ -1,4 +1,6 @@
 name: Cost Center Sync with Caching
+permissions:
+  contents: read
 
 on:
   schedule:
EOF
150 changes: 150 additions & 0 deletions CACHING_IMPLEMENTATION.md
@@ -0,0 +1,150 @@
# Cost Center Caching Implementation Summary

## 🎯 Objective
Speed up cost center operations by caching cost center lookups, which cuts redundant GitHub API calls. The benefit is largest in teams mode, where a single run resolves many cost centers.

## ✅ Implementation Completed

### 1. Core Caching System (`src/cost_center_cache.py`)
- **Smart Cache Management**: Persistent JSON-based cache with configurable TTL (24 hours default)
- **Automatic Expiration**: Time-based cache invalidation to ensure data freshness
- **Safe Concurrent Writes**: Atomic file operations to prevent cache corruption from interrupted or overlapping writes
- **Comprehensive Statistics**: Detailed cache performance metrics and monitoring
- **Graceful Error Handling**: Fallback mechanisms for cache corruption or I/O errors
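
The combination of TTL expiry, atomic writes, and corruption fallback listed above can be sketched as follows. This is a minimal illustration, not the code in `src/cost_center_cache.py`: the class name `CostCenterCacheSketch` and its method names are hypothetical, and only the JSON layout (`version`, `last_updated`, `cost_centers`) follows this document.

```python
import json
import os
import tempfile
from datetime import datetime, timedelta


class CostCenterCacheSketch:
    """Hypothetical stand-in for the cache described above."""

    def __init__(self, path, ttl_hours=24.0):
        self.path = path
        self.ttl = timedelta(hours=ttl_hours)
        self.entries = self._load()

    def _load(self):
        # Fall back to an empty cache if the file is missing or corrupt.
        try:
            with open(self.path, encoding="utf-8") as fh:
                return json.load(fh).get("cost_centers", {})
        except (OSError, ValueError, AttributeError):
            return {}

    def get(self, name, now=None):
        # Return the cached ID, or None if absent or older than the TTL.
        now = now or datetime.now()
        entry = self.entries.get(name)
        if entry is None:
            return None
        age = now - datetime.fromisoformat(entry["timestamp"])
        return entry["id"] if age < self.ttl else None

    def set(self, name, cost_center_id, now=None):
        now = now or datetime.now()
        self.entries[name] = {"id": cost_center_id, "timestamp": now.isoformat()}
        self._save(now)

    def _save(self, now):
        # Write to a temp file in the same directory, then rename it over the
        # target so readers never observe a half-written cache file.
        directory = os.path.dirname(self.path) or "."
        os.makedirs(directory, exist_ok=True)
        payload = {
            "version": "1.0",
            "last_updated": now.isoformat(),
            "cost_centers": self.entries,
        }
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        os.replace(tmp, self.path)
```

The rename via `os.replace` is what makes each write atomic; it does not by itself serialize two processes that update the cache at the same time, so the last writer wins.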

### 2. Integration with Teams Manager (`src/teams_cost_center_manager.py`)
- **Seamless Integration**: Cache automatically used in `ensure_cost_centers_exist` method
- **Performance Metrics**: Real-time cache hit rate reporting and API call tracking
- **Fallback Behavior**: Graceful degradation when cache is unavailable
- **Zero Configuration**: Works out-of-the-box with sensible defaults
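
A rough sketch of how a cache-first lookup with hit-rate reporting might look. The function name, its parameters, and the `create_or_lookup` callback are assumptions made for illustration; the real `ensure_cost_centers_exist` method may be structured differently.

```python
def ensure_cost_centers_exist_sketch(names, cache, create_or_lookup):
    """Resolve names to IDs, calling create_or_lookup only on cache misses.

    `cache` is any dict-like mapping of name -> ID (hypothetical);
    `create_or_lookup` stands in for the GitHub API call.
    """
    ids, hits, api_calls = {}, 0, 0
    for name in names:
        cached = cache.get(name)
        if cached is not None:
            ids[name] = cached
            hits += 1
            continue
        ids[name] = create_or_lookup(name)  # cache miss: one API call
        cache[name] = ids[name]
        api_calls += 1
    hit_rate = 100.0 * hits / len(names) if names else 0.0
    return ids, hit_rate, api_calls
```

With four of five names already cached, this reports an 80% hit rate and a single API call.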

### 3. Command Line Interface (`main.py`)
- **Cache Statistics**: `--cache-stats` for performance monitoring
- **Cache Management**: `--clear-cache` and `--cache-cleanup` for maintenance
- **Non-Intrusive**: Cache commands work without requiring GitHub API access
- **Integration**: Cache operations can be combined with normal workflow commands

### 4. GitHub Actions Optimization (`.github/workflows/cost-center-sync-cached.yml`)
- **CI/CD Cache Integration**: Leverages GitHub Actions cache for persistent storage
- **Performance Monitoring**: Automated cache statistics reporting
- **Flexible Configuration**: Support for cache clearing and cleanup operations
- **Artifact Collection**: Performance reports and cache data preservation

### 5. Documentation and Testing
- **Updated README**: Comprehensive documentation with examples and benefits
- **Performance Comparison**: Demonstration script showing a 34,000x+ speedup for fully cached runs
- **Usage Examples**: Clear command-line examples for all cache operations
- **Best Practices**: Guidelines for optimal cache usage in different scenarios

## 📊 Performance Impact

### Benchmark Results (20 cost centers, 100ms API latency)
- **Cold Start (No Cache)**: 2.00 seconds, 20 API calls (20 calls × 100 ms each)
- **Warm Cache**: 0.00 seconds (rounded to two decimals), 0 API calls (100% cache hit rate)
- **Performance Gain**: 34,277x faster execution on a fully warm cache
- **API Reduction**: No API calls on repeat runs while cached entries remain valid
- **Mixed Scenario**: 80% cache hit rate for partial updates
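
The figures above are consistent with simple arithmetic: the cold-start time is the number of cost centers times the per-call latency, and the reported speedup implies a warm-cache time far below the two-decimal display precision. A short check, using only the numbers quoted in this section:

```python
cost_centers = 20
api_latency_s = 0.100   # 100 ms per API call, as stated in the benchmark
speedup = 34_277        # reported performance gain

cold_s = cost_centers * api_latency_s   # one API call per cost center
warm_s = cold_s / speedup               # warm time implied by the ratio

print(f"cold start:   {cold_s:.2f} s")
print(f"implied warm: {warm_s * 1e6:.0f} microseconds")  # prints 58
```

This is why the warm-cache row shows 0.00 seconds: a time of roughly 58 microseconds rounds to zero at two decimal places.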

### Real-World Benefits
- ⚡ **Dramatic Speed Improvement**: Near-instantaneous cost center resolution
- 🔄 **Reduced Rate-Limit Pressure**: Fewer GitHub API calls count against rate limits
- 💰 **Cost Efficiency**: Lower computational costs in CI/CD pipelines
- 🎯 **Better User Experience**: Faster feedback during planning and execution
- 📈 **Scalability**: Time saved grows with the number of teams and cost centers

## 🔧 Technical Features

### Cache Architecture
```
.cache/cost_centers.json
{
  "version": "1.0",
  "last_updated": "2024-01-15T10:30:45.123456",
  "cost_centers": {
    "Engineering Team": {
      "id": "cc_12345",
      "timestamp": "2024-01-15T10:30:45.123456"
    }
  }
}
```

### Key Implementation Details
- **TTL Management**: 24-hour default expiration with configurable duration
- **Atomic Operations**: Safe concurrent access and corruption prevention
- **Smart Invalidation**: Configuration-aware cache key generation
- **Memory Efficient**: JSON-based storage with minimal memory footprint
- **Platform Agnostic**: Works across different operating systems and environments

## 🚀 Usage Examples

### Basic Cache Operations
```bash
# View current cache performance
python main.py --cache-stats

# Clear cache when needed
python main.py --clear-cache

# Remove expired entries
python main.py --cache-cleanup
```

### Teams Mode with Caching
```bash
# First run - populates cache
python main.py --teams-mode --assign-cost-centers --mode plan

# Subsequent runs - uses cache (much faster)
python main.py --teams-mode --assign-cost-centers --mode apply --yes
```

### GitHub Actions Integration
```yaml
- name: Restore cost center cache
  uses: actions/cache@v4
  with:
    path: .cache/
    key: cost-center-cache-${{ github.repository }}-${{ hashFiles('**/config.yaml') }}
```

## 🎯 Benefits Achieved

### For Development Teams
- **Faster Iteration**: Quick feedback during cost center planning
- **Reduced Waiting**: Near-instant cost center resolution
- **Better Debugging**: Cache statistics help identify performance bottlenecks

### For CI/CD Pipelines
- **Faster Deployments**: Significantly reduced execution time
- **Lower Resource Usage**: Fewer API calls mean lower computational costs
- **Better Reliability**: Reduced dependency on GitHub API response times

### For Enterprise Operations
- **Scalability**: Performance improvement grows with organization size
- **Cost Optimization**: Reduced API usage and computational resources
- **Operational Efficiency**: Faster automation and better user experience

## 🔮 Future Enhancements
- **Distributed Caching**: Support for shared cache across multiple runners
- **Cache Warming**: Pre-populate cache with known cost center mappings
- **Advanced Analytics**: Detailed performance metrics and trend analysis
- **Configuration Integration**: Cache settings in main configuration file

## 📝 Files Modified/Created

### Core Implementation
- `src/cost_center_cache.py` - New caching system
- `src/teams_cost_center_manager.py` - Integrated caching
- `main.py` - Added cache management CLI options

### Documentation & Testing
- `README.md` - Updated with caching documentation
- `scripts/performance_comparison.py` - Performance demonstration
- `.github/workflows/cost-center-sync-cached.yml` - GitHub Actions integration

### Configuration
- `.gitignore` - Already excludes `.cache/` directory
- Cache files stored in `.cache/cost_centers.json` (auto-created)

This implementation provides a robust, high-performance caching solution that significantly improves the user experience while maintaining data integrity and reliability.
53 changes: 51 additions & 2 deletions README.md
@@ -96,12 +96,14 @@ Set up GitHub Actions for automatic syncing every 6 hours - see [Automation](#au
- Enterprise scope: Sync all teams across the enterprise
- Automatic cost center creation with bracket notation naming
- Full sync mode (removes users who left teams)
-- Single assignment (multi-team users get last team's assignment)
+- Single assignment (existing cost center assignments are preserved by default)

### Additional Features
- 🔄 **Plan/Apply execution**: Preview changes before applying
- 📊 **Enhanced logging**: Real-time success/failure tracking
-- 🐳 **Container ready**: Dockerfile and docker-compose included
+- **Smart user handling**: Skips users already in cost centers by default
+- ⚡ **Override option**: `--ignore-current-cost-center` to move users between cost centers
+- 🐳 **Container ready**: Dockerfile and docker-compose included
- ⚙️ **Automation examples**: GitHub Actions, cron, and shell scripts
- 🔧 **Auto-creation**: Automatic cost center creation (no manual UI setup)

@@ -217,6 +219,9 @@ python main.py --assign-cost-centers --mode apply
# Apply without confirmation (automation)
python main.py --create-cost-centers --assign-cost-centers --mode apply --yes

# Move users between cost centers if needed
python main.py --assign-cost-centers --ignore-current-cost-center --mode apply

# Incremental mode (only new users, for cron jobs)
python main.py --assign-cost-centers --incremental --mode apply --yes
```
@@ -233,10 +238,54 @@ python main.py --teams-mode --assign-cost-centers --mode apply
# Apply without confirmation (automation)
python main.py --teams-mode --assign-cost-centers --mode apply --yes

# Move users between cost centers if needed
python main.py --teams-mode --assign-cost-centers --ignore-current-cost-center --mode apply

# Generate summary report
python main.py --teams-mode --summary-report
```

**Smart User Handling:**
- By default, users already in cost centers are skipped to avoid conflicts
- Use `--ignore-current-cost-center` to move users between cost centers

### Cache Management (Performance Optimization)

The tool automatically caches cost center mappings to improve performance by avoiding redundant API calls:

```bash
# View cache statistics
python main.py --cache-stats

# Clear the entire cache
python main.py --clear-cache

# Remove expired cache entries
python main.py --cache-cleanup

# Cache is automatically used during normal operations
python main.py --teams-mode --assign-cost-centers --mode plan # Uses cache
```

**Cache Benefits:**
- 📈 **Significant performance improvement** for teams with many cost centers
- 🔄 **Automatic expiration** (24 hours by default) ensures data freshness
- 💾 **GitHub Actions cache integration** for CI/CD workflows
- 🎯 **Smart invalidation** when configuration changes

**Cache Statistics Example:**
```
===== Cost Center Cache Statistics =====
Cache file: .cache/cost_centers.json
Total entries: 25
Valid entries: 23
Expired entries: 2
Cache TTL: 24.0 hours
Last updated: 2024-01-15T10:30:45.123456
Effective hit rate: 92.0%
==========================================
```
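
The numbers in this sample are consistent with a hit rate computed as valid entries over total entries (23 / 25 = 92.0%). A sketch under that assumption, with a hypothetical function name and the `cost_centers` layout used by the cache file:

```python
from datetime import datetime, timedelta


def cache_stats_sketch(cost_centers, now, ttl_hours=24.0):
    """Summarize entries of the form {name: {"id": ..., "timestamp": ISO-8601}}."""
    ttl = timedelta(hours=ttl_hours)
    total = len(cost_centers)
    valid = sum(
        1
        for entry in cost_centers.values()
        if now - datetime.fromisoformat(entry["timestamp"]) < ttl
    )
    return {
        "total": total,
        "valid": valid,
        "expired": total - valid,
        # Assumed definition: share of entries still within the TTL.
        "hit_rate": 100.0 * valid / total if total else 0.0,
    }
```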

**Note:** Incremental mode is NOT supported in Teams Mode. All team members are processed every run.

## Incremental Processing (PRU Mode Only)
2 changes: 1 addition & 1 deletion TEAMS_INTEGRATION.md
@@ -236,7 +236,7 @@ teams:
- Token doesn't have access to the organization

### Warning: "User X is in multiple teams"
-**This is expected**: The tool warns you when a user belongs to multiple teams. The user will be assigned to the LAST team's cost center processed. Review these warnings in plan mode to understand which cost center each multi-team user will be assigned to.
+**This is expected**: The tool warns you when a user belongs to multiple teams. By default, users already in cost centers will be skipped to avoid conflicts. Use `--ignore-current-cost-center` if you need to move users between cost centers. Review these warnings in plan mode to understand assignment behavior.
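
To see ahead of time which users would trigger this warning, team membership can be inverted into a per-user list of teams. The helper below is a hypothetical sketch, not part of the tool; it assumes membership is available as a mapping of team name to member logins.

```python
from collections import defaultdict


def find_multi_team_users(team_members):
    """Return {user: [teams...]} for users that appear in more than one team."""
    teams_by_user = defaultdict(list)
    for team, members in team_members.items():
        for user in members:
            teams_by_user[user].append(team)
    return {user: teams for user, teams in teams_by_user.items() if len(teams) > 1}
```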

## Best Practices
