From 4841ede59b8f340221a27098219eb66a1091e88c Mon Sep 17 00:00:00 2001
From: Parthib Mukherjee <parthibmukherjee@gmail.com>
Date: Wed, 24 Sep 2025 14:01:50 +0000
Subject: [PATCH 1/4] added(AGENTS.MD): added AGENTS.MD Signed-off-by: Parthib
 Mukherjee <parthibmukherjee@gmail.com>

---
 AGENTS.md | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 AGENTS.md
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000..5ff140c59
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,21 @@
+AGENTS: Quick Guide for kubeflow/sdk
+
+- Setup: use `uv`. Create venv and sync deps with `make install-dev`.
+- Lint/format (CI parity): `make verify` (runs `ruff check --show-fixes` and `ruff format --check`).
+- One-off lint/format locally: `uv run ruff check --fix .` then `uv run ruff format kubeflow`.
+- Pre-commit: `uv run pre-commit install`; run all hooks with `uv run pre-commit run --all-files`.
+- Run all unit tests + coverage: `make test-python` (HTML by default; XML with `make test-python report=xml`).
+- Run tests for one file: `uv run pytest -q kubeflow/trainer/utils/utils_test.py`.
+- Run a single test: `uv run pytest -q kubeflow/trainer/utils/utils_test.py::test_name -k "pattern"`.
+- Coverage for ad-hoc runs: `uv run coverage run -m pytest <path>` then `uv run coverage report`.
+- Packaging: project uses Hatchling; optional build with `uv build`.
+
+Code style (ruff manages lint + format)
+- Line length 100; target Python 3.9; double quotes; spaces indent; docstring code wrapped at 100.
+- Imports: isort via ruff; first-party is `kubeflow`; combine `as` imports; force sort within sections; prefer absolute imports.
+- Naming: pep8-naming enforced; functions/vars `snake_case`, classes `PascalCase`, constants `UPPER_SNAKE_CASE`; prefix private with `_`.
+- Types: annotate public APIs and tests; avoid `Any`; include return types; prefer `TypedDict`, `Literal`, `Enum`; use Pydantic v2 models in `kubeflow.trainer.types` for data schemas.
+- Errors: raise specific exceptions; avoid bare `except`; use `raise ... from err` for chaining; validate inputs early (Pydantic when applicable).
+- Tests: place under `kubeflow/trainer/**` as `*_test.py`; use pytest style and fixtures (see `kubeflow/trainer/test/common.py`); avoid external I/O in unit tests.
+- CI: PR titles must follow Conventional Commits (types: chore, fix, feat, revert; scopes: ci, docs, examples, scripts, test, trainer). CI runs `make verify` and tests on 3.9/3.11.
+- Help: `make help` lists available targets.

From 50dff646761ddd58ecbaf6372f5ddf0a2afcb454 Mon Sep 17 00:00:00 2001
From: Parthib Mukherjee <parthibmukherjee@gmail.com>
Date: Mon, 29 Sep 2025 19:53:25 +0000
Subject: [PATCH 2/4] docs: Revise AGENTS.md for comprehensive guidance on
 setup, tooling, and development principles

Signed-off-by: Parthib Mukherjee <parthibmukherjee@gmail.com>
---
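Note on principle 1 ("Maintain Stable Public Interfaces") in the diff below: the
advice to add new parameters as keyword-only can be sketched as follows. This is a
minimal illustration, not SDK code. `train_model`, `TrainingResult`, and
`max_retries` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TrainingResult:
    """Hypothetical result type, used only for this illustration."""

    model_id: str
    verbose: bool
    max_retries: int


def train_model(model_id: str, verbose: bool = False, *, max_retries: int = 3) -> TrainingResult:
    """Pretend to train a model; `max_retries` is a hypothetical new parameter."""
    return TrainingResult(model_id=model_id, verbose=verbose, max_retries=max_retries)


# Existing positional call sites keep working unchanged.
old_style = train_model("model-a", True)

# The new parameter is opt-in and must be passed by name.
new_style = train_model("model-a", max_retries=5)
```

Because `max_retries` sits after the bare `*`, a call such as
`train_model("model-a", True, 5)` raises `TypeError`, so the new parameter cannot
silently shift the meaning of existing positional arguments.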
 AGENTS.md | 398 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 377 insertions(+), 21 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 5ff140c59..a945fcec0 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,21 +1,377 @@
-AGENTS: Quick Guide for kubeflow/sdk
-
-- Setup: use `uv`. Create venv and sync deps with `make install-dev`.
-- Lint/format (CI parity): `make verify` (runs `ruff check --show-fixes` and `ruff format --check`).
-- One-off lint/format locally: `uv run ruff check --fix .` then `uv run ruff format kubeflow`.
-- Pre-commit: `uv run pre-commit install`; run all hooks with `uv run pre-commit run --all-files`.
-- Run all unit tests + coverage: `make test-python` (HTML by default; XML with `make test-python report=xml`).
-- Run tests for one file: `uv run pytest -q kubeflow/trainer/utils/utils_test.py`.
-- Run a single test: `uv run pytest -q kubeflow/trainer/utils/utils_test.py::test_name -k "pattern"`.
-- Coverage for ad-hoc runs: `uv run coverage run -m pytest <path>` then `uv run coverage report`.
-- Packaging: project uses Hatchling; optional build with `uv build`.
-
-Code style (ruff manages lint + format)
-- Line length 100; target Python 3.9; double quotes; spaces indent; docstring code wrapped at 100.
-- Imports: isort via ruff; first-party is `kubeflow`; combine `as` imports; force sort within sections; prefer absolute imports.
-- Naming: pep8-naming enforced; functions/vars `snake_case`, classes `PascalCase`, constants `UPPER_SNAKE_CASE`; prefix private with `_`.
-- Types: annotate public APIs and tests; avoid `Any`; include return types; prefer `TypedDict`, `Literal`, `Enum`; use Pydantic v2 models in `kubeflow.trainer.types` for data schemas.
-- Errors: raise specific exceptions; avoid bare `except`; use `raise ... from err` for chaining; validate inputs early (Pydantic when applicable).
-- Tests: place under `kubeflow/trainer/**` as `*_test.py`; use pytest style and fixtures (see `kubeflow/trainer/test/common.py`); avoid external I/O in unit tests.
-- CI: PR titles must follow Conventional Commits (types: chore, fix, feat, revert; scopes: ci, docs, examples, scripts, test, trainer). CI runs `make verify` and tests on 3.9/3.11.
-- Help: `make help` lists available targets.
+AGENTS: Guide for kubeflow/sdk
+
+## Who This Is For
+
+- **AI agents**: Automate repository tasks with minimal context
+- **Contributors**: Humans using AI assistants or working directly
+- **Maintainers**: Ensure assistants follow project conventions and CI rules
+
+## What This Document Provides
+
+- Environment setup and canonical commands for format, lint, and tests
+- Repository map and conventions to keep changes consistent
+- Guardrails for PRs, CI, and releases
+- Quick references for common tasks and troubleshooting
+
+## Project Overview
+
+**Purpose**: Kubeflow SDK provides a unified Python SDK for AI practitioners to interact with multiple Kubeflow projects via consistent APIs, focusing on user workflows over infrastructure details.
+
+**Problem It Solves**: Reduces Kubernetes and multi-project complexity, offering simple, local-first Python interfaces for training, tuning, pipelines (planned), and model lifecycle management.
+
+**Key Benefits**:
+- Unified experience across Kubeflow projects
+- Simplified AI workflows with minimal infrastructure knowledge
+- Local development support (install via `pip`) with optional cluster backends
+
+**Today's Scope**:
+- **Available**: Kubeflow Trainer (train/fine-tune with different backends)
+- **Planned**: Katib (HPO), Pipelines (workflows), Model Registry
+- See README "Supported Kubeflow Projects" for current status
+
+## Repository Map
+
+```
+kubeflow/trainer/           # Trainer component
+├── backends/kubernetes/    # K8s backend implementation + tests
+├── backends/localprocess/  # Local process backend
+├── api/                   # Client API, TrainerClient
+├── types/                 # Pydantic v2 data models
+└── utils/                 # Shared helpers + tests
+docs/                      # Diagrams and proposals
+scripts/                   # Project scripts (e.g., changelog)
+Root files: AGENTS.md, README.md, pyproject.toml, Makefile, CI workflows
+```
+
+## Environment & Tooling
+
+- **Package manager**: `uv` (creates `.venv` automatically via targets)
+- **Lint/format**: `ruff` (isort integrated)
+- **Tests**: `pytest` with coverage
+- **Build**: Hatchling (optional `uv build`)
+- **Pre-commit**: Config provided and enforced in CI
+
+## Quick Start
+
+**Setup**:
+```bash
+make install-dev              # Install uv, create .venv, sync deps
+```
+
+**Verify (CI parity)**:
+```bash
+make verify                   # Runs ruff check --show-fixes and ruff format --check
+```
+
+**Testing**:
+```bash
+make test-python              # All unit tests + coverage (HTML by default)
+make test-python report=xml   # XML coverage report
+uv run pytest -q kubeflow/trainer/utils/utils_test.py                    # One file
+uv run pytest -q kubeflow/trainer/utils/utils_test.py::test_name -k "pattern"  # One test
+uv run coverage run -m pytest <path> && uv run coverage report          # Ad-hoc coverage
+```
+
+**Local lint/format**:
+```bash
+uv run ruff check --fix .     # Fix lint issues
+uv run ruff format kubeflow   # Format code
+```
+
+**Type checking**:
+```bash
+uv run mypy kubeflow          # Run type checker
+```
+
+**Pre-commit**:
+```bash
+uv run pre-commit install                    # Install hooks
+uv run pre-commit run --all-files           # Run all hooks
+```
+
+## Core Development Principles
+
+### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
+
+**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
+
+❌ **Bad - Breaking Change:**
+```python
+def train_model(id, verbose=False):  # Changed from `model_id`
+    pass
+```
+
+✅ **Good - Stable Interface:**
+```python
+def train_model(model_id: str, verbose: bool = False) -> TrainingResult:
+    """Train model with optional verbose output."""
+    pass
+```
+
+**Before making ANY changes to public APIs:**
+- Check if the function/class is exported in `__init__.py`
+- Look for existing usage patterns in tests and examples
+- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
+- Mark experimental features clearly with docstring warnings
+
+### 2. Code Quality Standards
+
+**All Python code MUST include type hints and return types.**
+
+❌ **Bad:**
+```python
+def p(u, d):
+    return [x for x in u if x not in d]
+```
+
+✅ **Good:**
+```python
+def filter_completed_jobs(jobs: list[str], completed: set[str]) -> list[str]:
+    """Filter out jobs that are already completed.
+    
+    Args:
+        jobs: List of job identifiers to filter.
+        completed: Set of completed job identifiers.
+        
+    Returns:
+        List of jobs that are not yet completed.
+    """
+    return [job for job in jobs if job not in completed]
+```
+
+**Style Requirements:**
+- Line length 100, Python 3.9 target, double quotes, spaces indent
+- Imports: isort via ruff; first-party is `kubeflow`; prefer absolute imports
+- Naming: pep8-naming; functions/vars `snake_case`, classes `PascalCase`, constants `UPPER_SNAKE_CASE`; prefix private with `_`
+- Use descriptive, self-explanatory variable names. Avoid overly short or cryptic identifiers
+- Break up complex functions (>20 lines) into smaller, focused functions where it makes sense
+- Follow existing patterns in the codebase you're modifying
+
+### 3. Testing Requirements
+
+**Every new feature or bugfix MUST be covered by unit tests.**
+
+**Test Organization:**
+- Unit tests: `kubeflow/trainer/**/*_test.py` (no network calls allowed)
+- Use `pytest` as the testing framework
+- See `kubeflow/trainer/test/common.py` for fixtures and patterns
+
+**Test Quality Checklist:**
+- [ ] Tests fail when your new logic is broken
+- [ ] Happy path is covered
+- [ ] Edge cases and error conditions are tested
+- [ ] Use fixtures/mocks for external dependencies
+- [ ] Tests are deterministic (no flaky tests)
+
+```python
+def test_filter_completed_jobs():
+    """Test filtering completed jobs from a list."""
+    jobs = ["job-1", "job-2", "job-3"]
+    completed = {"job-1", "job-2"}
+    
+    result = filter_completed_jobs(jobs, completed)
+    
+    assert result == ["job-3"]
+    assert len(result) == 1
+```
+
+### 4. Security and Risk Assessment
+
+**Security Checklist:**
+- [ ] No `eval()`, `exec()`, or `pickle` on user-controlled input
+- [ ] Handle exceptions properly (no bare `except:`) and use descriptive error messages
+- [ ] Remove unreachable/commented code before committing
+- [ ] Ensure proper resource cleanup (file handles, connections)
+- [ ] No secrets in code, logs, or examples
+
+❌ **Bad:**
+```python
+def load_config(path):
+    with open(path) as f:
+        return eval(f.read())  # ⚠️ Never eval user input
+```
+
+✅ **Good:**
+```python
+import yaml
+
+def load_config(path: str) -> dict:
+    """Load configuration from YAML file."""
+    with open(path, "r") as f:
+        return yaml.safe_load(f)
+```
+
+### 5. Documentation Standards
+
+**Use Google-style docstrings with an Args section for all public functions.**
+
+❌ **Insufficient Documentation:**
+```python
+def submit_job(name, config):
+    """Submit a job."""
+```
+
+✅ **Complete Documentation:**
+```python
+def submit_job(name: str, config: dict, *, priority: str = "normal") -> str:
+    """Submit a training job with specified configuration.
+    
+    Args:
+        name: The job name identifier.
+        config: Job configuration dictionary.
+        priority: Job priority level ('low', 'normal', 'high').
+        
+    Returns:
+        Job ID string for tracking the submitted job.
+        
+    Raises:
+        InvalidConfigError: If the configuration is invalid.
+        ResourceUnavailableError: If required resources are not available.
+    """
+```
+
+**Documentation Guidelines:**
+- Types go in function signatures, NOT in docstrings
+- Focus on "why" rather than "what" in descriptions
+- Document all parameters, return values, and exceptions
+- Keep descriptions concise but clear
+- Use Pydantic v2 models in `kubeflow.trainer.types` for schemas
+
+### 6. Architectural Improvements
+
+**When you encounter code that could be improved, suggest better designs:**
+
+❌ **Poor Design:**
+```python
+def process_training(data, k8s_client, storage, logger):
+    # Function doing too many things
+    validated = validate_data(data)
+    job = k8s_client.create_job(validated)
+    storage.save_metadata(job)
+    logger.info(f"Created job {job.name}")
+    return job
+```
+
+✅ **Better Design:**
+```python
+@dataclass
+class TrainingJobResult:
+    """Result of training job submission."""
+    job_id: str
+    status: str
+    created_at: datetime
+    
+class TrainingJobManager:
+    """Handles training job lifecycle operations."""
+    
+    def __init__(self, k8s_client: KubernetesClient, storage: Storage):
+        self.k8s = k8s_client
+        self.storage = storage
+        
+    def submit_job(self, config: TrainingConfig) -> TrainingJobResult:
+        """Submit and track a new training job."""
+        validated_config = self._validate_config(config)
+        job = self._create_k8s_job(validated_config)
+        self._save_job_metadata(job)
+        return TrainingJobResult(
+            job_id=job.name,
+            status=job.status,
+            created_at=job.created_at
+        )
+```
+
+## Component: Trainer
+
+**Client entrypoints**: `kubeflow.trainer.api.TrainerClient` and trainer definitions such as `CustomTrainer`
+
+**Backends**:
+- `localprocess`: local execution for fast iteration
+- `kubernetes`: K8s-backed jobs, see `backends/kubernetes`
+
+**Typical flow**:
+1. Get runtime, define trainer, submit with `TrainerClient().train(...)`
+2. `wait_for_job_status(...)` then fetch logs with `get_job_logs(...)`
+3. For full example, see README "Run your first PyTorch distributed job"
+
+**Integration patterns**:
+- Follow existing patterns in `kubeflow.trainer.backends` for new backends
+- Use `kubeflow.trainer.types` for data models and type definitions
+- Implement proper error handling and resource cleanup
+- Include comprehensive tests for backend implementations
+
+## CI & PRs
+
+**PR Requirements**:
+- Title must follow Conventional Commits:
+  - Types: `chore`, `fix`, `feat`, `revert`
+  - Scopes: `ci`, `docs`, `examples`, `scripts`, `test`, `trainer`
+- CI runs `make verify` and tests on Python 3.9/3.11
+- Keep changes focused and minimal; align with existing style
+
+## Releasing
+
+**Version management**:
+```bash
+make release VERSION=X.Y.Z   # Updates kubeflow/__init__.py and generates changelog
+```
+- Do not commit secrets; verify coverage and lint pass before tagging
+
+## Troubleshooting
+
+- **`uv` not found**: run `make uv` or re-run `make install-dev`
+- **Ruff not installed**: `make install-dev` ensures tools; or `uv tool install ruff`
+- **Virtualenv issues**: remove `.venv` and re-run `make install-dev`
+- **Tests failing locally but not in CI**: run `make verify` to match CI formatting and lint rules
+
+## Quick Reference Checklist
+
+Before submitting code changes:
+
+- [ ] **Breaking Changes**: Verified no public API changes without deprecation
+- [ ] **Type Hints**: All functions have complete type annotations and return types
+- [ ] **Tests**: New functionality is fully tested with unit tests
+- [ ] **Security**: No dangerous patterns (eval, bare except, resource leaks, etc.)
+- [ ] **Documentation**: Google-style docstrings for public functions
+- [ ] **Code Quality**: `make verify` passes (lint and format)
+- [ ] **Architecture**: Suggested improvements where applicable
+- [ ] **Commit Message**: Follows Conventional Commits format
+
+## Guidance for AI Agents
+
+**Preferred commands**: use `uv run ...` to ensure tool consistency and `.venv` usage
+
+**Development workflow**:
+1. Read existing code patterns before making changes
+2. Follow the Core Development Principles above
+3. Run validation commands before proposing changes
+4. Use descriptive commit messages and PR descriptions
+
+**Validation before proposing changes**:
+- Lint/format: `make verify`
+- Tests: `make test-python` or targeted `pytest` invocations
+- Type checking: `uv run mypy kubeflow` (if available)
+
+**Commit/PR hygiene**:
+- Follow Conventional Commits in titles and messages
+- Include rationale ("why") in commit messages/PR descriptions
+- Do not push secrets or change git config
+- Scope discipline: only modify files relevant to the task; keep diffs minimal
+
+## Security & Privacy
+
+- No secrets in code, logs, or examples
+- Avoid external network calls in tests; prefer local fixtures/mocks
+- Validate inputs and raise specific exceptions
+
+## Community & Support
+
+- **Slack**: `#kubeflow-ml-experience`
+- **Meetings**: "Kubeflow SDK and ML Experience" (bi-weekly)
+- **Issues/Discussions**: https://github.com/kubeflow/sdk
+- **Contributing**: see CONTRIBUTING.md
+
+## References
+
+- **README**: high-level overview and example usage
+- **Makefile**: authoritative commands and targets (`make help`)
+- **Help**: `make help` lists available targets

From b4715209dbb54dfbf045d27a259e35d8b0e73ef0 Mon Sep 17 00:00:00 2001
From: Parthib Mukherjee <109328510+hawkaii@users.noreply.github.com>
Date: Tue, 30 Sep 2025 18:32:43 +0530
Subject: [PATCH 3/4] Remove introductory line from AGENTS.md

Removed introductory line about AGENTS guide.

Signed-off-by: Parthib Mukherjee <109328510+hawkaii@users.noreply.github.com>
---
 AGENTS.md | 2 --
 1 file changed, 2 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index a945fcec0..9a326a56e 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,5 +1,3 @@
-AGENTS: Guide for kubeflow/sdk
-
 ## Who This Is For
 
 - **AI agents**: Automate repository tasks with minimal context

From 3a8573527a1a65400b46fd9fecf88c0ec842c686 Mon Sep 17 00:00:00 2001
From: Parthib Mukherjee <parthibmukherjee@gmail.com>
Date: Sun, 19 Oct 2025 11:54:51 +0000
Subject: [PATCH 4/4] docs: Revise AGENTS.md to clarify agent behavior policies
 and development workflow

Signed-off-by: Parthib Mukherjee <parthibmukherjee@gmail.com>
---
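Note on the "Test Structure Pattern" added in the diff below: the parametrized
example there uses `TestCase` and `SUCCESS` without defining them. The sketch
below shows one way the pattern could fit together using only the standard
library. Everything in it is an assumption made for illustration: the `TestCase`
fields, the `SUCCESS`/`FAILED` markers, the `check_case` helper, and the type
check added to `filter_completed_jobs` are not copied from `backend_test.py`.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

SUCCESS = "success"  # hypothetical status markers for this sketch
FAILED = "failed"


@dataclass
class TestCase:
    """Hypothetical stand-in for the dataclass described in this patch."""

    name: str
    expected_status: str
    config: dict[str, Any] = field(default_factory=dict)
    expected_output: Optional[list[str]] = None
    expected_error: Optional[type[Exception]] = None


def filter_completed_jobs(jobs: list[str], completed: set[str]) -> list[str]:
    """Return the jobs that are not in the completed set."""
    # The type check is added here only so the sketch has an error path to exercise.
    if not isinstance(completed, set):
        raise TypeError("completed must be a set")
    return [job for job in jobs if job not in completed]


CASES = [
    TestCase(
        name="one job left",
        expected_status=SUCCESS,
        config={"jobs": ["job-1", "job-2"], "completed": {"job-1"}},
        expected_output=["job-2"],
    ),
    TestCase(
        name="completed is not a set",
        expected_status=FAILED,
        config={"jobs": ["job-1"], "completed": "job-1"},
        expected_error=TypeError,
    ),
]


def check_case(test_case: TestCase) -> str:
    """Run one case, covering both the success and the exception path."""
    print("Executing test:", test_case.name)
    try:
        result = filter_completed_jobs(**test_case.config)
    except Exception as err:
        # An exception is acceptable only when the case expects that exact type.
        assert test_case.expected_status == FAILED
        assert type(err) is test_case.expected_error
        return FAILED
    assert test_case.expected_status == SUCCESS
    assert result == test_case.expected_output
    return SUCCESS
```

Under pytest, the body of `check_case` would typically live in a test function
decorated with `@pytest.mark.parametrize("test_case", CASES)`, so each entry in
`CASES` is reported as its own test.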
 AGENTS.md | 139 ++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 88 insertions(+), 51 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 9a326a56e..0c24b1f40 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -4,28 +4,29 @@
 - **Contributors**: Humans using AI assistants or working directly
 - **Maintainers**: Ensure assistants follow project conventions and CI rules
 
-## What This Document Provides
+## Agent Behavior Policy
 
-- Environment setup and canonical commands for format, lint, and tests
-- Repository map and conventions to keep changes consistent
-- Guardrails for PRs, CI, and releases
-- Quick references for common tasks and troubleshooting
+AI agents should:
+- Make atomic, minimal, and reversible changes.
+- Prefer local analysis (`uv run`, `make verify`, `pytest`) before proposing commits.
+- NEVER modify configuration, CI/CD, or release automation unless explicitly requested.
+- Avoid non-deterministic code; when randomness is required, set seeds through fixtures.
+- Use `AGENTS.md` and `Makefile` as the source of truth for development commands.
 
-## Project Overview
+Agents must NOT:
+- Bypass tests or linters
+- Introduce dependencies without updating `pyproject.toml`
+- Generate or commit large autogenerated files
 
-**Purpose**: Kubeflow SDK provides a unified Python SDK for AI practitioners to interact with multiple Kubeflow projects via consistent APIs, focusing on user workflows over infrastructure details.
 
-**Problem It Solves**: Reduces Kubernetes and multi-project complexity, offering simple, local-first Python interfaces for training, tuning, pipelines (planned), and model lifecycle management.
+### Context Awareness
 
-**Key Benefits**:
-- Unified experience across Kubeflow projects
-- Simplified AI workflows with minimal infrastructure knowledge
-- Local development support (install via `pip`) with optional cluster backends
+Before writing code, agents should:
+
+- Read docstrings and existing test cases for pattern alignment
+- Match import patterns from neighboring files
+- Preserve existing logging and error-handling conventions
 
-**Today's Scope**:
-- **Available**: Kubeflow Trainer (train/fine-tune with different backends)
-- **Planned**: Katib (HPO), Pipelines (workflows), Model Registry
-- See README "Supported Kubeflow Projects" for current status
 
 ## Repository Map
 
@@ -51,6 +52,7 @@ Root files: AGENTS.md, README.md, pyproject.toml, Makefile, CI workflows
 
 ## Quick Start
 
+<!-- BEGIN: AGENT_COMMANDS -->
 **Setup**:
 ```bash
 make install-dev              # Install uv, create .venv, sync deps
@@ -86,6 +88,27 @@ uv run mypy kubeflow          # Run type checker
 uv run pre-commit install                    # Install hooks
 uv run pre-commit run --all-files           # Run all hooks
 ```
+<!-- END: AGENT_COMMANDS -->
+
+## Development Workflow for AI Agents
+
+**Preferred commands**: use `uv run ...` to ensure tool consistency and `.venv` usage
+
+**Before making changes**:
+1. Read existing code patterns and docstrings for alignment
+2. Follow the Core Development Principles below
+3. Run validation commands before proposing changes
+
+**Validation before proposing changes**:
+- Lint/format: `make verify`
+- Tests: `make test-python` or targeted `pytest` invocations
+- Type checking: `uv run mypy kubeflow` (if available)
+
+**Commit/PR hygiene**:
+- Follow Conventional Commits in titles and messages
+- Include rationale ("why") in commit messages/PR descriptions
+- Do not push secrets or change git config
+- Scope discipline: only modify files relevant to the task; keep diffs minimal
 
 ## Core Development Principles
 
@@ -153,6 +176,13 @@ def filter_completed_jobs(jobs: list[str], completed: set[str]) -> list[str]:
 - Unit tests: `kubeflow/trainer/**/*_test.py` (no network calls allowed)
 - Use `pytest` as the testing framework
 - See `kubeflow/trainer/test/common.py` for fixtures and patterns
+- Keep unit test structure consistent across test files (see `kubeflow/trainer/backends/kubernetes/backend_test.py` for reference)
+
+**Test Structure Pattern** (following `backend_test.py`):
+- Use `TestCase` dataclass for parametrized tests
+- Include `name`, `expected_status`, `config`, and `expected_output` or `expected_error` fields
+- Print test execution status for debugging
+- Handle both success and exception cases in the same test function
 
 **Test Quality Checklist:**
 - [ ] Tests fail when your new logic is broken
@@ -161,6 +191,9 @@ def filter_completed_jobs(jobs: list[str], completed: set[str]) -> list[str]:
 - [ ] Use fixtures/mocks for external dependencies
 - [ ] Tests are deterministic (no flaky tests)
 
+**Test Examples:**
+
+Simple test:
 ```python
 def test_filter_completed_jobs():
     """Test filtering completed jobs from a list."""
@@ -173,6 +206,31 @@ def test_filter_completed_jobs():
     assert len(result) == 1
 ```
 
+Parametrized test cases (preferred for multiple scenarios):
+```python
+@pytest.mark.parametrize(
+    "test_case",
+    [
+        TestCase(
+            name="valid flow with all defaults",
+            expected_status=SUCCESS,
+            config={"name": "job-1"},
+            expected_output=["job-1"],
+        ),
+        TestCase(
+            name="empty jobs list",
+            expected_status=SUCCESS,
+            config={"name": "empty"},
+            expected_output=[],
+        ),
+    ],
+)
+def test_filter_jobs_parametrized(test_case):
+    """Test job filtering with multiple scenarios."""
+    result = filter_jobs(**test_case.config)
+    assert result == test_case.expected_output
+```
+
 ### 4. Security and Risk Assessment
 
 **Security Checklist:**
@@ -282,6 +340,20 @@ class TrainingJobManager:
 
 **Client entrypoints**: `kubeflow.trainer.api.TrainerClient` and trainer definitions such as `CustomTrainer`
 
+**Trainer Types**:
+
+**CustomTrainer** (`kubeflow.trainer.types.CustomTrainer`):
+- **Purpose**: For custom, self-contained training functions that you write yourself
+- **Flexibility**: Complete control over the training process
+- **Use case**: "Bring your own training code" - maximum flexibility
+- **Key attributes**: `func` (your training function), `func_args`, `packages_to_install`, `pip_index_urls`, `num_nodes`, `resources_per_node`, `env`
+
+**BuiltinTrainer** (`kubeflow.trainer.types.BuiltinTrainer`):
+- **Purpose**: For pre-built training frameworks with existing fine-tuning logic
+- **Convenience**: Just configure parameters; the training logic is already implemented
+- **Use case**: "Use our pre-built trainers" - convenience for common scenarios
+- **Key attributes**: `config` (currently only supports `TorchTuneConfig` for LLM fine-tuning with TorchTune)
+
 **Backends**:
 - `localprocess`: local execution for fast iteration
 - `kubernetes`: K8s-backed jobs, see `backends/kubernetes`
@@ -334,42 +406,7 @@ Before submitting code changes:
 - [ ] **Architecture**: Suggested improvements where applicable
 - [ ] **Commit Message**: Follows Conventional Commits format
 
-## Guidance for AI Agents
-
-**Preferred commands**: use `uv run ...` to ensure tool consistency and `.venv` usage
-
-**Development workflow**:
-1. Read existing code patterns before making changes
-2. Follow the Core Development Principles above
-3. Run validation commands before proposing changes
-4. Use descriptive commit messages and PR descriptions
-
-**Validation before proposing changes**:
-- Lint/format: `make verify`
-- Tests: `make test-python` or targeted `pytest` invocations
-- Type checking: `uv run mypy kubeflow` (if available)
-
-**Commit/PR hygiene**:
-- Follow Conventional Commits in titles and messages
-- Include rationale ("why") in commit messages/PR descriptions
-- Do not push secrets or change git config
-- Scope discipline: only modify files relevant to the task; keep diffs minimal
-
-## Security & Privacy
-
-- No secrets in code, logs, or examples
-- Avoid external network calls in tests; prefer local fixtures/mocks
-- Validate inputs and raise specific exceptions
-
 ## Community & Support
 
-- **Slack**: `#kubeflow-ml-experience`
-- **Meetings**: "Kubeflow SDK and ML Experience" (bi-weekly)
 - **Issues/Discussions**: https://github.com/kubeflow/sdk
 - **Contributing**: see CONTRIBUTING.md
-
-## References
-
-- **README**: high-level overview and example usage
-- **Makefile**: authoritative commands and targets (`make help`)
-- **Help**: `make help` lists available targets