Automation: Direction #117

Last generated: 2026-01-19T18:36:26.686Z
Provider: openai
Model: gpt-5.2

## Summary
Automate a reliable “generate → verify → test → publish” pipeline for this CDP client library, centered on reproducible code generation, static checks, and a single canonical CI workflow. Reduce repo noise (committed `.bish-index`/`.bish.sqlite` artifacts and the large `bfg-1.15.0.jar`) and make regeneration failures fail CI visibly so they cannot be missed.

## Direction (what and why)
**What:** Treat the generated `cdp/` package as a build artifact derived from pinned upstream protocol inputs, and enforce in CI that:
1) generation is deterministic,
2) generated code matches what’s committed (the alternative, to stop committing generated code and publish built artifacts instead, is a bigger shift),
3) packaging/tests/type checks run consistently across supported Python versions.
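
Point 1 can be checked mechanically by fingerprinting the generated tree. The following is a minimal sketch, not the repo's actual tooling: `tree_digest` is a hypothetical helper, and it assumes the generator writes its output under a directory such as `cdp/`.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Return one SHA-256 digest covering every file under root.

    Files are visited in sorted order, so the result does not depend on
    the order in which the filesystem lists them. Bytecode caches are
    skipped because they are not generator output.
    """
    digest = hashlib.sha256()
    files = sorted(
        p for p in root.rglob("*")
        if p.is_file() and "__pycache__" not in p.parts
    )
    for path in files:
        # Hash the relative path as well, so renames change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Running the generator twice and comparing `tree_digest(Path("cdp"))` after each run would flag nondeterminism such as unordered dict iteration or embedded timestamps.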

**Why:** This repo’s core risk is drift between `generator/` inputs and committed `cdp/` outputs, plus CI sprawl from many synced workflows. A tight, explicit “regen check” and a small set of quality gates will reduce maintainer toil and prevent shipping broken protocol bindings.

## Plan (next 1–3 steps)

### 1) Add a single, repo-owned CI workflow for Python quality gates
Create: **`.github/workflows/ci.yml`** (keep the org-synced workflows, but this one becomes the authoritative signal for code health).

Include jobs:
- **lint/type**: `ruff` (or keep minimal if you don’t want new deps), `mypy` using existing `mypy.ini`
- **tests**: run `python -m pytest` if pytest exists; otherwise run the repo’s existing scripts (see below)
- **package/import sanity**: `python -c "import cdp; print(cdp.__version__ if hasattr(cdp,'__version__') else 'ok')"`
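
If the one-liner becomes awkward to maintain, the same import check can live in a small script. This is a sketch only; `import_sanity` is a hypothetical name, and whether `cdp` defines `__version__` is not confirmed here.

```python
import importlib


def import_sanity(module_name: str = "cdp") -> str:
    """Import the module and return its version, or "ok" if it has none.

    Any import error propagates, which is what makes the CI step fail.
    """
    module = importlib.import_module(module_name)
    return str(getattr(module, "__version__", "ok"))
```

Calling `print(import_sanity())` from a CI step would behave like the `python -c` command above.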

Concrete commands (adapt to poetry since `pyproject.toml` + `poetry.lock` exist):
- `poetry install --no-interaction`
- `poetry run python test_import.py`
- `poetry run python minimal_test.py`
- `poetry run python comprehensive_test.py` (if stable in CI)
- `poetry run mypy cdp` (or `poetry run mypy .` if configured)

Matrix:
- Python **3.9–3.12** (adjust to your supported versions in `pyproject.toml`)

### 2) Make code generation reproducible and enforce “no drift” in CI
Add:
- **`scripts/regenerate.sh`** (new) that runs generation in one place, e.g.:
  - `poetry run python generator/generate.py` (or whatever the generator entrypoint is)
  - then formats (if you adopt formatting)
- **`scripts/check_generated.sh`** (new) that:
  - runs regeneration
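  - fails if `git diff --exit-code -- cdp` shows changes (`git diff` ignores untracked files, so also check `git status --porcelain -- cdp` to catch newly generated modules that were never committed)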

Then add a CI job **generated-check** that runs `scripts/check_generated.sh`.

This creates a hard guarantee: PRs that change generator logic or protocol inputs must also update generated output.
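
The drift check itself is small. The plan proposes a shell script; the sketch below shows the same logic in Python for illustration, using `git status --porcelain` so that modified and untracked files are both reported. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def parse_porcelain(output: str) -> list[str]:
    """Extract paths from `git status --porcelain` (v1) output.

    Each line is a two-character status code, a space, then the path.
    Renames appear as "old -> new" and are returned unchanged.
    """
    return [line[3:] for line in output.splitlines() if len(line) > 3]


def drifted_paths(repo: Path, target: str = "cdp") -> list[str]:
    """List files under target that differ from what is committed."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--untracked-files=all", "--", target],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_porcelain(result.stdout)
```

After regeneration, a non-empty result from `drifted_paths(Path("."))` means the committed `cdp/` output is stale, and the job should exit non-zero.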

### 3) Reduce committed binary/noise artifacts and prevent reintroduction
Observed noise candidates:
- `**/.bish-index`, `**/.bish.sqlite` (currently committed in multiple dirs)
- `bfg-1.15.0.jar` (large binary at repo root)

Actions:
1) Update **`.gitignore`** to include:
   - `**/.bish-index`
   - `**/.bish.sqlite`
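2) Stop tracking them going forward (smallest safe step: remove them from the index only; this neither rewrites history nor deletes the local files):
   - `git rm --cached --ignore-unmatch -- ':(glob)**/.bish-index' ':(glob)**/.bish.sqlite'` (the quoted `:(glob)` pathspecs let git expand `**` itself, so the command does not depend on the shell’s globstar setting)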
3) Decide on `bfg-1.15.0.jar`:
   - If not required for normal users, remove from repo and document in `docs/` how to obtain it when needed.
   - If required, move it under `tools/` and document why it exists; ideally fetch in CI on demand instead of committing.

(If removing these files is too disruptive right now, at least ignore them and stop updating them.)
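
To prevent reintroduction, CI could also scan the tracked file list for these names. A minimal sketch, assuming the list of paths comes from `git ls-files`; `find_noise` and `NOISE_NAMES` are hypothetical names.

```python
from pathlib import PurePosixPath

# File names treated as noise, taken from the candidates listed above.
NOISE_NAMES = frozenset({".bish-index", ".bish.sqlite"})


def find_noise(tracked_paths, noise_names=NOISE_NAMES) -> list[str]:
    """Return tracked paths whose file name is a known noise artifact."""
    return sorted(
        path for path in tracked_paths
        if PurePosixPath(path).name in noise_names
    )
```

A CI step would pass the lines of `git ls-files` to `find_noise` and fail when the result is non-empty.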

## Risks/unknowns
- **Generator entrypoint/inputs are unclear**: `generator/generate.py` exists, but we need to confirm whether it fetches upstream CDP schemas or uses vendored JSON. If it downloads from the network, CI reproducibility will suffer—pin to a versioned URL or vendor the protocol JSON with checksums.
- **Test scripts vs pytest**: The repo uses standalone scripts (`test_import.py`, `minimal_test.py`, `comprehensive_test.py`) and shell scripts rather than a pytest suite. CI should use the same mechanisms to avoid rewriting tests prematurely.
- **Org-synced workflows may conflict**: Many `.github/workflows/auto-*.yml` are present. Ensure the new `ci.yml` is required in branch protection, and treat the rest as advisory/automation.
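
For the first risk, if the protocol JSON ends up vendored, checksum verification could look like the sketch below. The manifest format (file name mapped to a SHA-256 hex digest) and the function name are assumptions, not something the repo currently provides.

```python
import hashlib
from pathlib import Path


def verify_checksums(root: Path, expected: dict[str, str]) -> list[str]:
    """Return the names of files that are missing or whose SHA-256 differs.

    expected maps a file name relative to root to its hex digest.
    """
    bad = []
    for name, want in sorted(expected.items()):
        path = root / name
        if not path.is_file():
            bad.append(name)  # a missing input counts as a failure
            continue
        if hashlib.sha256(path.read_bytes()).hexdigest() != want:
            bad.append(name)
    return bad
```

Running this before generation would make an accidental or upstream change to the protocol inputs fail early, rather than surface as unexplained drift in `cdp/`.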

## Suggested tests
Add/standardize these checks (run locally and in CI):
1) **Import sanity**
   - `poetry run python test_import.py`
2) **Minimal smoke**
   - `poetry run python minimal_test.py`
3) **Comprehensive regression**
   - `poetry run python comprehensive_test.py`
4) **Generated code drift**
   - `./scripts/check_generated.sh`
5) **Type check**
   - `poetry run mypy cdp`
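
To run the same checks locally in one go, a small runner can execute each command and report every failure rather than stopping at the first. This is a sketch; `run_gates` is a hypothetical helper, and the command list simply mirrors the checks above.

```python
import subprocess

# Commands mirror the suggested checks; adjust to what the repo supports.
GATES = [
    ["poetry", "run", "python", "test_import.py"],
    ["poetry", "run", "python", "minimal_test.py"],
    ["poetry", "run", "python", "comprehensive_test.py"],
    ["./scripts/check_generated.sh"],
    ["poetry", "run", "mypy", "cdp"],
]


def run_gates(gates) -> list[list[str]]:
    """Run every gate, even after a failure, and return those that failed."""
    failed = []
    for cmd in gates:
        print("$", " ".join(cmd), flush=True)
        try:
            ok = subprocess.run(cmd).returncode == 0
        except OSError:
            # The executable is missing or cannot be run.
            ok = False
        if not ok:
            failed.append(cmd)
    return failed
```

A wrapper script could call `run_gates(GATES)` and exit non-zero when the returned list is non-empty.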

### Progress verification checklist
- [ ] `ci.yml` runs on PRs and is set as a required check
- [ ] CI fails if regenerated `cdp/` differs from committed output
- [ ] `.bish-*` files are no longer tracked and don’t reappear in PRs
- [ ] A contributor can run: `poetry install && ./scripts/regenerate.sh && ./scripts/check_generated.sh` successfully
