Automation: Direction #117

@github-actions

Last generated: 2026-01-19T18:36:26.686Z
Provider: openai
Model: gpt-5.2

Summary

Automate a reliable “generate → verify → test → publish” pipeline for this CDP client library, centered on reproducible code generation, static checks, and a single canonical CI workflow. Reduce repo noise (committed SQLite/Bish artifacts, large jar) and make regeneration failures impossible to miss.

Direction (what and why)

What: Treat the generated cdp/ package as a build artifact derived from pinned upstream protocol inputs, and enforce in CI that:

  1. generation is deterministic,
  2. generated code matches what’s committed (the alternative, to stop committing generated code and publish built artifacts instead, is a bigger shift),
  3. packaging/tests/type checks run consistently across supported Python versions.

Why: This repo’s core risk is drift between generator/ inputs and committed cdp/ outputs, plus CI sprawl from many synced workflows. A tight, explicit “regen check” and a small set of quality gates will reduce maintainer toil and prevent shipping broken protocol bindings.

Plan (next 1–3 steps)

1) Add a single, repo-owned CI workflow for Python quality gates

Create: .github/workflows/ci.yml (keep the org-synced workflows, but this one becomes the authoritative signal for code health).

Include jobs:

  • lint/type: ruff (or keep minimal if you don’t want new deps), mypy using existing mypy.ini
  • tests: run python -m pytest if pytest exists; otherwise run the repo’s existing scripts (see below)
  • package/import sanity: python -c "import cdp; print(getattr(cdp, '__version__', 'ok'))"

Concrete commands (adapt to poetry since pyproject.toml + poetry.lock exist):

  • poetry install --no-interaction
  • poetry run python test_import.py
  • poetry run python minimal_test.py
  • poetry run python comprehensive_test.py (if stable in CI)
  • poetry run mypy cdp (or poetry run mypy . if configured)

Matrix:

  • Python 3.9–3.12 (adjust to your supported versions in pyproject.toml)
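
The gate commands above could also be driven from one small runner so that local runs and CI execute the same list. The sketch below is only an illustration: the file name scripts/run_gates.py and the names GATES, run_gates and failed are hypothetical, and it assumes the script is started with poetry run python so that sys.executable is already the Poetry environment's interpreter.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Sequence

# Hypothetical gate list mirroring the commands named in this plan.
# Assumes invocation via `poetry run python scripts/run_gates.py`, so
# sys.executable points at the Poetry-managed interpreter.
GATES: list[list[str]] = [
    [sys.executable, "test_import.py"],
    [sys.executable, "minimal_test.py"],
    # [sys.executable, "comprehensive_test.py"],  # enable once stable in CI
    [sys.executable, "-m", "mypy", "cdp"],
]


def run_gates(gates: Sequence[Sequence[str]]) -> list[tuple[str, int]]:
    """Run every gate, even after a failure, and return (command, exit code) pairs."""
    results: list[tuple[str, int]] = []
    for cmd in gates:
        try:
            code = subprocess.run(list(cmd), check=False).returncode
        except FileNotFoundError:
            code = 127  # executable missing; same convention as a POSIX shell
        results.append((" ".join(cmd), code))
    return results


def failed(results: Sequence[tuple[str, int]]) -> list[str]:
    """Return the commands that exited with a non-zero status."""
    return [name for name, code in results if code != 0]
```

A CI step would call run_gates(GATES) and exit non-zero when failed(...) is non-empty. Running every gate before reporting, rather than stopping at the first failure, shows all broken checks in a single CI run.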

2) Make code generation reproducible and enforce “no drift” in CI

Add:

  • scripts/regenerate.sh (new) that runs generation in one place, e.g.:
    • poetry run python generator/generate.py (or whatever the generator entrypoint is)
    • then formats (if you adopt formatting)
  • scripts/check_generated.sh (new) that:
    • runs regeneration
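    • fails if git diff --exit-code -- cdp shows changes, or if git status --porcelain -- cdp lists untracked files (git diff alone does not see newly generated files that were never added)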

Then add a CI job generated-check that runs scripts/check_generated.sh.

This creates a hard guarantee: PRs that change generator logic or protocol inputs must also update generated output.
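
As an alternative to the git-based check, drift can be detected by hashing the cdp/ tree before and after regeneration. This is a minimal sketch, not the repo's actual tooling: tree_digest and detect_drift are hypothetical names, and the regenerate callable stands in for whatever the real generator entrypoint turns out to be. On a fresh CI checkout the "before" snapshot equals the committed output, so any reported path is drift.

```python
from __future__ import annotations

import hashlib
from pathlib import Path
from typing import Callable


def tree_digest(root: Path) -> dict[str, str]:
    """Map each file under root (relative POSIX path) to its SHA-256 hex digest."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def detect_drift(root: Path, regenerate: Callable[[], None]) -> list[str]:
    """Run regenerate() and return sorted paths that were added, removed, or changed."""
    before = tree_digest(root)
    regenerate()  # assumption: rewrites files under root in place
    after = tree_digest(root)
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Because the comparison covers every file under the directory, newly generated files and deleted files are reported alongside modified ones.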

3) Reduce committed binary/noise artifacts and prevent reintroduction

Observed noise candidates:

  • **/.bish-index, **/.bish.sqlite (currently committed in multiple dirs)
  • bfg-1.15.0.jar (large binary at repo root)

Actions:

  1. Update .gitignore to include:
    • **/.bish-index
    • **/.bish.sqlite
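  2. Stop tracking them going forward (smallest safe step: remove from the current tree only; rewriting history is a separate decision):
    • git rm --cached -- ':(glob)**/.bish-index' ':(glob)**/.bish.sqlite' (--cached keeps the local copies; the quoted :(glob) pathspecs let git match at any depth without relying on shell globstar)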
  3. Decide on bfg-1.15.0.jar:
    • If not required for normal users, remove from repo and document in docs/ how to obtain it when needed.
    • If required, move it under tools/ and document why it exists; ideally fetch in CI on demand instead of committing.

(If removing these files is too disruptive right now, at least ignore them and stop updating them.)
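
To keep these artifacts from coming back, CI could scan the tracked file list and fail when a noise file is present. The sketch below assumes the two file names listed above are the complete set. NOISE_NAMES, tracked_files and find_noise are hypothetical names introduced for illustration.

```python
from __future__ import annotations

import subprocess
from pathlib import PurePosixPath
from typing import Iterable

# File names this plan treats as noise; extend as needed.
NOISE_NAMES = frozenset({".bish-index", ".bish.sqlite"})


def tracked_files() -> list[str]:
    """List tracked paths via `git ls-files -z` (must run inside the repo)."""
    out = subprocess.run(
        ["git", "ls-files", "-z"], capture_output=True, text=True, check=True
    ).stdout
    return [p for p in out.split("\0") if p]


def find_noise(tracked: Iterable[str]) -> list[str]:
    """Return tracked paths whose final component is a known noise artifact."""
    return sorted(p for p in tracked if PurePosixPath(p).name in NOISE_NAMES)
```

A CI step would call find_noise(tracked_files()) and fail on a non-empty result. Matching on the final path component catches the files at any depth, the same scope as the .gitignore patterns.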

Risks/unknowns

  • Generator entrypoint/inputs are unclear: generator/generate.py exists, but we need to confirm whether it fetches upstream CDP schemas or uses vendored JSON. If it downloads from the network, CI reproducibility will suffer; in that case, pin to a versioned URL or vendor the protocol JSON with checksums.
  • Test scripts vs pytest: The repo uses standalone test scripts (test_import.py, minimal_test.py, comprehensive_test.py) and shell scripts. CI should run those same entry points to avoid rewriting tests prematurely.
  • Org-synced workflows may conflict: Many .github/workflows/auto-*.yml are present. Ensure the new ci.yml is required in branch protection, and treat the rest as advisory/automation.
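
If the protocol JSON ends up vendored, the "with checksums" part of the first risk could be enforced before generation runs. This is a sketch under assumptions: the file names in the usage note are illustrative rather than confirmed paths in this repo, and sha256_of and verify_inputs are hypothetical helpers.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large schema files are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_inputs(base: Path, expected: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means all pinned inputs match."""
    problems: list[str] = []
    for rel, want in sorted(expected.items()):
        target = base / rel
        if not target.is_file():
            problems.append(f"{rel}: missing")
        elif sha256_of(target) != want:
            problems.append(f"{rel}: checksum mismatch")
    return problems
```

For example, a committed manifest could map names such as browser_protocol.json and js_protocol.json to their expected digests, and regeneration would refuse to start when verify_inputs returns anything. Updating the pinned protocol version then becomes an explicit, reviewable change to that manifest.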

Suggested tests

Add/standardize these checks (run locally and in CI):

  1. Import sanity
    • poetry run python test_import.py
  2. Minimal smoke
    • poetry run python minimal_test.py
  3. Comprehensive regression
    • poetry run python comprehensive_test.py
  4. Generated code drift
    • ./scripts/check_generated.sh
  5. Type check
    • poetry run mypy cdp

Progress verification checklist

  • ci.yml runs on PRs and is set as a required check
  • CI fails if regenerated cdp/ differs from committed output
  • .bish-* files are no longer tracked and don’t reappear in PRs
  • A contributor can run: poetry install && ./scripts/regenerate.sh && ./scripts/check_generated.sh successfully
