Add Automated Testing for Bash and PowerShell Scripts #1049

@amondnet

Summary

This proposal adds automated tests for the shell scripts in scripts/bash/ and scripts/powershell/, using the Bats and Pester testing frameworks and running them in GitHub Actions CI.

Motivation

Currently, the Spec Kit project has no automated tests for its critical shell scripts (common.sh, setup-plan.sh, check-prerequisites.sh, create-new-feature.sh, update-agent-context.sh). This creates risks:

  • No regression detection when scripts are modified
  • Manual testing burden for contributors
  • No verification that Bash and PowerShell scripts produce identical outputs
  • Difficult to validate cross-platform compatibility

The CONTRIBUTING.md states "Write tests for new functionality" [1], but the scripts predating this guideline remain untested.

Related Issues

Several existing issues demonstrate the need for automated testing:

Script Bugs and Edge Cases:

  • #1038 - Unable to source common.sh when CDPATH is set
  • #1029 - Dates generated by scripts are incorrect
  • #975 - create-new-feature.ps1 ignores existing feature branches when determining next feature number
  • #965 - Scripts REPO_ROOT variable issue in nested folders
  • #957 - PowerShell script issue with apostrophes in arguments

Script Usability Problems:

  • #842 - Copilot/Spec-kit seem unable to write PowerShell scripts that work

Automated tests would likely have caught most of these issues before release, sparing users the friction and reducing the maintenance burden.

Proposed Solution

Implement automated testing using industry-standard frameworks:

For Bash scripts: Bats (Bash Automated Testing System)

  • TAP-compliant testing framework [2]
  • Used by major projects (Docker, Kubernetes helpers)
  • CI integration with GitHub Actions [3]

For PowerShell scripts: Pester

  • The standard testing framework for PowerShell; a version ships with Windows [4]
  • Pre-installed on GitHub-hosted runners [5]
  • Native GitHub Actions support

Implementation Plan

Directory Structure

tests/
├── bash/
│   ├── test_helper/        # bats-support, bats-assert, bats-file
│   ├── unit/               # Function-level tests
│   └── integration/        # Workflow tests
├── powershell/
│   ├── unit/               # Function-level tests
│   └── integration/        # Workflow tests
└── fixtures/               # Shared test data
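
A minimal sketch of how PR 1 could create this layout. The function name scaffold_tests_tree is hypothetical and not part of the repository; only the directory names come from the tree above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the proposed tests/ tree under a given root
# (defaults to the current directory). Safe to run more than once.
scaffold_tests_tree() {
    local root="${1:-.}"
    mkdir -p \
        "$root/tests/bash/test_helper" \
        "$root/tests/bash/unit" \
        "$root/tests/bash/integration" \
        "$root/tests/powershell/unit" \
        "$root/tests/powershell/integration" \
        "$root/tests/fixtures"
}
```

One option for populating test_helper/ is to add bats-support, bats-assert, and bats-file as Git submodules from the bats-core organization; vendoring them or installing them in CI would work as well.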

Test Coverage Goals

Unit Tests (70%+ coverage target):

  • common.sh/ps1: Core utility functions (get_repo_root, get_current_branch, find_feature_dir_by_prefix)
  • setup-plan.sh/ps1: Plan template setup, JSON/text output validation
  • check-prerequisites.sh/ps1: Prerequisite validation logic
  • create-new-feature.sh/ps1: Branch numbering, naming conventions, Git/non-Git modes
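
To illustrate the kind of logic the branch-numbering unit tests would pin down, here is a sketch of a pure helper that is easy to test in isolation. It is not the implementation in create-new-feature.sh; the name next_feature_number and the "NNN-name" input format are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read names such as "001-login" on stdin (one per line)
# and print the next zero-padded, three-digit feature number.
next_feature_number() {
    local highest=0 name prefix
    while IFS= read -r name; do
        prefix="${name%%-*}"
        # Skip names that do not start with a purely numeric prefix
        case "$prefix" in
            ''|*[!0-9]*) continue ;;
        esac
        # 10# forces base 10; without it Bash rejects "008" as invalid octal
        if (( 10#$prefix > highest )); then
            highest=$((10#$prefix))
        fi
    done
    printf '%03d\n' "$((highest + 1))"
}
```

Feeding it the combined list of spec directories and branch names, for example, would be one way to cover the scenario described in #975, and prefixes such as 008 and 009 are worth a dedicated test because of the octal pitfall noted in the comment.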

Integration Tests:

  • update-agent-context.sh/ps1: Full workflow including plan parsing, template generation
  • Git vs non-Git repository scenarios
  • Cross-platform compatibility (Ubuntu, Windows, macOS)
  • Parity tests (Bash and PowerShell produce identical outputs)
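
A parity test needs to ignore differences that come from the platform rather than from the scripts. The sketch below compares two captured outputs after stripping carriage returns and trailing whitespace; the function names are hypothetical, and path separators or absolute paths in the output would need additional normalization that is not shown here.

```shell
#!/usr/bin/env bash
# Normalize captured output so CRLF line endings and trailing whitespace
# do not count as differences.
normalize_output() {
    tr -d '\r' | sed 's/[[:space:]]*$//'
}

# Succeed when the two output files match after normalization.
# On mismatch, print both normalized outputs and return non-zero.
outputs_match() {
    local bash_out pwsh_out
    bash_out="$(normalize_output < "$1")"
    pwsh_out="$(normalize_output < "$2")"
    if [ "$bash_out" != "$pwsh_out" ]; then
        printf 'Outputs differ.\n--- bash ---\n%s\n--- powershell ---\n%s\n' \
            "$bash_out" "$pwsh_out"
        return 1
    fi
}
```

In CI, each script's output would first be written to a file (the Bash script on one side, its PowerShell counterpart run through pwsh on the other) and the two files passed to outputs_match.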

CI Integration

GitHub Actions workflow (.github/workflows/test-scripts.yml):

  • Parallel execution: Bash tests (ubuntu-latest), PowerShell tests (windows-latest)
  • Cross-platform smoke tests (all three major OSes)
  • Test result artifacts and reports
  • Runs on every push/PR affecting scripts or tests
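
GitHub Actions can restrict a workflow to relevant changes natively through its paths filter. For local runs or other CI systems, the same selection rule can be sketched in shell. The function name needs_script_tests is hypothetical; the matched paths are taken from this proposal.

```shell
#!/usr/bin/env bash
# Read changed file paths on stdin (one per line). Succeed if any of them
# is under scripts/ or tests/, or is the proposed workflow file itself.
needs_script_tests() {
    local path
    while IFS= read -r path; do
        case "$path" in
            scripts/*|tests/*|.github/workflows/test-scripts.yml) return 0 ;;
        esac
    done
    return 1
}
```

A possible use, assuming the base branch is origin/main: `git diff --name-only origin/main...HEAD | needs_script_tests && bats --recursive tests/bash`.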

Benefits

  1. Regression Prevention: Automated detection of breaking changes
  2. Cross-Platform Validation: Ensures Windows and Unix scripts stay synchronized
  3. Contributor Confidence: Contributors can verify their changes don't break existing functionality
  4. Documentation: Tests serve as executable examples of script usage
  5. Quality Gates: Aligns with project's "Write tests for new functionality" guideline

Phased Rollout

To facilitate review, this will be submitted as multiple PRs:

  1. PR 1: Test infrastructure and fixtures
  2. PR 2: Bash common.sh tests + CI integration
  3. PR 3: PowerShell common.ps1 tests + CI integration
  4. PR 4: Remaining unit tests (setup-plan, check-prerequisites, create-new-feature)
  5. PR 5: Integration tests (update-agent-context, workflows)
  6. PR 6: Documentation and CONTRIBUTING.md updates

Example Test Cases

Bash (Bats):

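setup() {
    load '../test_helper/bats-support/load'
    load '../test_helper/bats-assert/load'
    # Absolute repo path, derived from this file's location in tests/bash/unit/
    REPO_ROOT="$(cd "$BATS_TEST_DIRNAME/../../.." && pwd)"
    # pwd -P resolves symlinks (e.g. /tmp on macOS) so the path matches Git's output
    TEST_TEMP_DIR="$(cd "$(mktemp -d)" && pwd -P)"
}

teardown() {
    rm -rf "$TEST_TEMP_DIR"
}

@test "get_repo_root returns git root when available" {
    cd "$TEST_TEMP_DIR"
    git init --quiet
    # Source by absolute path: a relative path would no longer resolve after cd
    source "$REPO_ROOT/scripts/bash/common.sh"

    result="$(get_repo_root)"
    assert_equal "$result" "$TEST_TEMP_DIR"
}

@test "get_current_branch respects SPECIFY_FEATURE env var" {
    export SPECIFY_FEATURE="002-my-feature"
    source "$REPO_ROOT/scripts/bash/common.sh"

    result="$(get_current_branch)"
    assert_equal "$result" "002-my-feature"
}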
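
Regression tests for reported bugs can follow the same pattern. As an illustration, the sketch below shows a well-known CDPATH pitfall: when CDPATH supplies the match, cd prints the directory it changed to, so a captured `$(cd dir && pwd)` contains two lines. Whether this is the exact cause of #1038 is an assumption, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Naive resolution: if the directory is found through CDPATH, `cd` echoes it
# to stdout, so the captured value holds two lines instead of one.
resolve_dir_naive() {
    (cd "$1" && pwd)
}

# Hardened variant: an empty CDPATH for this one command disables the lookup,
# and `--` guards against directory names that start with a dash.
resolve_dir_safe() {
    (CDPATH= cd -- "$1" && pwd)
}
```

A Bats test could set CDPATH in setup, call the function under test with a relative directory, and assert that the result is a single line that names an existing directory.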

PowerShell (Pester):

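Describe 'common.ps1' {
    It 'Get-RepoRoot returns git root when available' {
        # $TestDrive is the real filesystem path behind TestDrive:\
        Push-Location $TestDrive
        try {
            git init --quiet
            . "$PSScriptRoot/../../../scripts/powershell/common.ps1"

            $result = Get-RepoRoot

            # Git may report forward slashes on Windows, so normalize both sides
            (Resolve-Path $result).Path | Should -Be (Resolve-Path $TestDrive).Path
        }
        finally {
            # Restore the location even when the assertion fails
            Pop-Location
        }
    }
}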

Risks and Mitigations

Risk: CI execution time increase
Mitigation: Parallel job execution, selective test running on relevant path changes

Risk: Flaky tests
Mitigation: Proper temporary directory cleanup, timeout configurations, documented retry policies
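
A sketch of the temporary-directory cleanup mentioned above, for helper scripts that run outside a Bats teardown function. The name with_temp_dir is hypothetical.

```shell
#!/usr/bin/env bash
# Run a command inside a fresh temporary directory and always remove the
# directory afterwards, preserving the command's exit status.
with_temp_dir() {
    local dir status
    dir="$(mktemp -d)" || return 1
    # The subshell keeps the caller's working directory unchanged
    (cd "$dir" && "$@")
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Usage: `with_temp_dir git init --quiet` runs the command in an isolated directory that is gone once the call returns, so one test cannot leak state into the next.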

Risk: Maintenance burden
Mitigation: Clear test structure, comprehensive documentation, DRY principles with shared fixtures

Alternative Approaches Considered

  1. Shell script linters only (ShellCheck, PSScriptAnalyzer)
    • Rejected: Static analysis doesn't verify runtime behavior
  2. Manual testing checklist
    • Rejected: Error-prone, time-consuming, not scalable
  3. Single test framework for both
    • Rejected: No cross-platform framework provides a native feel for both ecosystems

References

Documentation

[1] CONTRIBUTING.md - https://github.com/spec-kit/spec-kit/blob/main/CONTRIBUTING.md#submitting-a-pull-request

[2] Bats-core GitHub Repository - https://github.com/bats-core/bats-core

[3] Bats-core Documentation - https://bats-core.readthedocs.io/

[4] Pester Official Website - https://pester.dev/

[5] GitHub Actions: Building and testing PowerShell - https://docs.github.com/en/actions/use-cases-and-examples/building-and-testing/building-and-testing-powershell

[6] Testing Bash with BATS (Opensource.com) - https://opensource.com/article/19/2/testing-bash-bats

[7] PullRequest.com: Testing Bash Scripts with BATS - https://www.pullrequest.com/blog/testing-bash-scripts-with-bats-a-practical-guide/

Related Issues

[8] Issue #1038 - Unable to source common.sh when CDPATH is set

[9] Issue #1029 - Dates generated by scripts are incorrect

[10] Issue #987 - create-new-feature shell script is often called incorrectly

[11] Issue #975 - create-new-feature.ps1 ignores existing feature branches

[12] Issue #965 - Scripts REPO_ROOT variable issue

[13] Issue #957 - PowerShell script issue with apostrophes

[14] Issue #842 - Copilot/Spec-kit seem unable to write PowerShell scripts that work

[15] Issue #338 - Problems detected by shellcheck on the provided bash scripts

AI Assistance Disclosure

This proposal was developed with assistance from PleaseAI + Claude Agent SDK (Anthropic). The analysis, test strategy, and implementation plan were collaboratively created. All technical claims have been verified against official documentation, and the proposed approach follows industry best practices for shell script testing.

