Description
Last generated: 2026-01-19T18:36:26.686Z
Provider: openai
Model: gpt-5.2
Summary
Automate a reliable “generate → verify → test → publish” pipeline for this CDP client library, centered on reproducible code generation, static checks, and a single canonical CI workflow. Reduce repo noise (committed SQLite/Bish artifacts, large jar) and make regeneration failures impossible to miss.
Direction (what and why)
What: Treat the generated cdp/ package as a build artifact derived from pinned upstream protocol inputs, and enforce in CI that:
- generation is deterministic,
- generated code matches what’s committed (or alternatively, stop committing generated code and publish built artifacts—but that’s a bigger shift),
- packaging/tests/type checks run consistently across supported Python versions.
Why: This repo’s core risk is drift between generator/ inputs and committed cdp/ outputs, plus CI sprawl from many synced workflows. A tight, explicit “regen check” and a small set of quality gates will reduce maintainer toil and prevent shipping broken protocol bindings.
Plan (next 1–3 steps)
1) Add a single, repo-owned CI workflow for Python quality gates
Create: .github/workflows/ci.yml (keep the org-synced workflows, but this one becomes the authoritative signal for code health).
Include jobs:
- lint/type: ruff (or keep minimal if you don’t want new deps), mypy using existing mypy.ini
- tests: run python -m pytest if pytest exists; otherwise run the repo’s existing scripts (see below)
- package/import sanity:
python -c "import cdp; print(cdp.__version__ if hasattr(cdp,'__version__') else 'ok')"
Concrete commands (adapt to poetry since pyproject.toml + poetry.lock exist):
- poetry install --no-interaction
- poetry run python test_import.py
- poetry run python minimal_test.py
- poetry run python comprehensive_test.py (if stable in CI)
- poetry run mypy cdp (or poetry run mypy . if configured)
Matrix:
- Python 3.9–3.12 (adjust to your supported versions in pyproject.toml)
2) Make code generation reproducible and enforce “no drift” in CI
Add:
- scripts/regenerate.sh (new) that runs generation in one place, e.g.:
  - poetry run python generator/generate.py (or whatever the generator entrypoint is)
  - then formats (if you adopt formatting)
- scripts/check_generated.sh (new) that:
  - runs regeneration
  - fails if git diff --exit-code -- cdp shows changes
Then add a CI job generated-check that runs scripts/check_generated.sh.
This creates a hard guarantee: PRs that change generator logic or protocol inputs must also update generated output.
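The drift check can also be sketched in Python, which may help if it ever needs to run where a POSIX shell is unavailable. This is a minimal sketch under stated assumptions, not the repo's implementation: tree_digest and has_drift are hypothetical helper names, and the regeneration step is passed in as a callable because the generator entrypoint is still unconfirmed. One difference from git diff --exit-code -- cdp is that git diff only reports changes to tracked files, while hashing the directory tree also notices files that regeneration newly creates.

```python
import hashlib
from pathlib import Path
from typing import Callable


def tree_digest(root: Path) -> str:
    """Hash the relative path and contents of every file under root, in sorted order."""
    digest = hashlib.sha256()
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        # Include the path so renames and new empty files change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def has_drift(root: Path, regenerate: Callable[[], object]) -> bool:
    """Return True if calling regenerate() changes anything under root."""
    before = tree_digest(root)
    regenerate()
    return tree_digest(root) != before
```

In CI, regenerate could wrap a subprocess call to the generator (for example poetry run python generator/generate.py, if that turns out to be the entrypoint), and the job would exit non-zero whenever has_drift(Path("cdp"), regenerate) returns True.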
3) Reduce committed binary/noise artifacts and prevent reintroduction
Observed noise candidates:
- **/.bish-index, **/.bish.sqlite (currently committed in multiple dirs)
- bfg-1.15.0.jar (large binary at repo root)
Actions:
- Update .gitignore to include:
  - **/.bish-index
  - **/.bish.sqlite
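- Stop tracking them going forward (smallest safe step: remove them from the current tree; this does not rewrite existing history). Quote the patterns so that git, rather than the shell, expands them at every directory depth:
  git rm -f ':(glob)**/.bish-index' ':(glob)**/.bish.sqlite'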
- Decide on bfg-1.15.0.jar:
  - If not required for normal users, remove from repo and document in docs/ how to obtain it when needed.
  - If required, move it under tools/ and document why it exists; ideally fetch in CI on demand instead of committing.
(If removing these files is too disruptive right now, at least ignore them and stop updating them.)
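To keep these files from being reintroduced, a small guard could scan the list of tracked files and fail when a noise artifact shows up. The following is a sketch only: find_tracked_noise and NOISE_BASENAMES are hypothetical names, and the input is assumed to be the forward-slash paths printed by git ls-files.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Basenames taken from the noise candidates listed above.
NOISE_BASENAMES = {".bish-index", ".bish.sqlite"}


def find_tracked_noise(tracked_paths: Iterable[str]) -> List[str]:
    """Return, sorted, the tracked paths whose file name is a known noise artifact."""
    return sorted(
        path
        for path in tracked_paths
        if PurePosixPath(path).name in NOISE_BASENAMES
    )
```

A CI step might pass it the lines of git ls-files output and exit non-zero if the returned list is non-empty. Matching on the basename rather than the full path means the guard works at any directory depth.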
Risks/unknowns
- Generator entrypoint/inputs are unclear: generator/generate.py exists, but we need to confirm whether it fetches upstream CDP schemas or uses vendored JSON. If it downloads from the network, CI reproducibility will suffer; pin to a versioned URL or vendor the protocol JSON with checksums.
- Test scripts vs pytest: The repo uses *_test.py scripts and shell scripts. CI should use the same mechanisms to avoid rewriting tests prematurely.
- Org-synced workflows may conflict: Many .github/workflows/auto-*.yml are present. Ensure the new ci.yml is required in branch protection, and treat the rest as advisory/automation.
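If the protocol JSON ends up vendored with checksums, the verification step could look roughly like the sketch below. Everything here is an assumption for illustration: the helper names are hypothetical, the pinned digests are assumed to live in a simple name-to-SHA-256 mapping, and the actual file names and locations depend on how the generator consumes its inputs.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_checksum_mismatches(base: Path, pinned: Dict[str, str]) -> List[str]:
    """Return names of pinned files that are missing or whose digest differs."""
    mismatches = []
    for name, expected in sorted(pinned.items()):
        path = base / name
        if not path.is_file() or sha256_of(path) != expected:
            mismatches.append(name)
    return mismatches
```

Running this before generation and aborting on any mismatch would make an unintended change to the protocol inputs fail loudly instead of silently producing different bindings.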
Suggested tests
Add/standardize these checks (run locally and in CI):
- Import sanity
poetry run python test_import.py
- Minimal smoke
poetry run python minimal_test.py
- Comprehensive regression
poetry run python comprehensive_test.py
- Generated code drift
./scripts/check_generated.sh
- Type check
poetry run mypy cdp
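To run the same checks locally and in CI, one option is a small runner that executes every command and reports all failures at once rather than stopping at the first. This is a sketch, not an existing script in the repo: run_checks and CHECKS are hypothetical names, and the command list simply mirrors the checks above (scripts/check_generated.sh does not exist until step 2 is done).

```python
import subprocess
import sys
from typing import List, Sequence

# Commands copied from the list above; whether each is stable in CI is unverified.
CHECKS: List[List[str]] = [
    ["poetry", "run", "python", "test_import.py"],
    ["poetry", "run", "python", "minimal_test.py"],
    ["poetry", "run", "python", "comprehensive_test.py"],
    ["./scripts/check_generated.sh"],
    ["poetry", "run", "mypy", "cdp"],
]


def run_checks(commands: Sequence[Sequence[str]]) -> List[str]:
    """Run every command and return the ones that failed or could not be started."""
    failed = []
    for command in commands:
        try:
            returncode = subprocess.run(list(command)).returncode
        except OSError:
            # The executable is missing or not runnable; count it as a failure.
            returncode = 1
        if returncode != 0:
            failed.append(" ".join(command))
    return failed
```

A wrapper script would call run_checks(CHECKS), print each failed command to sys.stderr, and exit with status 1 if the returned list is non-empty.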
Progress verification checklist
- ci.yml runs on PRs and is set as a required check
- CI fails if regenerated cdp/ differs from committed output
- .bish-index and .bish.sqlite files are no longer tracked and don’t reappear in PRs
- A contributor can successfully run:
  poetry install && ./scripts/regenerate.sh && ./scripts/check_generated.sh