Skip to content

Conversation

@vanpelt
Copy link
Collaborator

@vanpelt vanpelt commented Dec 21, 2025

Summary

This PR introduces the weave/integrations/ag_ui/ module - a shared abstraction layer for tracing agentic coding tools based on the AG-UI protocol.

Key Features

  • Event types based on AG-UI protocol for standardizing agentic tool communication
  • AgentEventParser protocol for agent-specific implementations
  • AgentTraceBuilder for converting events to Weave traces with proper parent-child relationships
  • Tool registries for agent-specific behaviors (Claude, future: Gemini, Codex)
  • Shared diff views for file changes visualization
  • Secret scanner for redacting sensitive information

Why AG-UI?

The AG-UI protocol provides a standardized way to communicate between agentic tools and observability systems. By building on this protocol, we can:

  • Support multiple agents (Claude Code, Gemini CLI, Codex CLI) with shared infrastructure
  • Provide consistent tracing and visualization across agents
  • Enable future protocol compatibility

Stacked PRs

This is the base branch. Subsequent PRs build on this:

  • feature/claude-plugin-core-v3 → Core Claude plugin implementation
  • feature/claude-plugin-cli-v3 → CLI tools (teleport, etc.)

Test plan

  • All existing tests pass
  • New module imports correctly
  • Lint checks pass

🤖 Generated with Claude Code

vanpelt and others added 15 commits December 20, 2025 20:49
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add parser protocol that defines the interface for agent-specific parsers to convert native log formats to AG-UI events.

- Created AgentEventParser protocol with parse() and parse_stream() methods
- Added support for secret redaction via redact_secrets parameter
- Includes comprehensive tests for protocol conformance
- Updated ag_ui __init__.py to export AgentEventParser

This follows TDD - tests written first, verified to fail, then implementation added.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Cherry-picked diff view files from feature/claude-plugin-cli and moved them to
ag_ui/views/ to make them reusable across agent integrations.

Changes:
- Moved diff_view.py and diff_utils.py from claude_plugin/views to ag_ui/views
- Updated internal imports to use new ag_ui.views location
- Created ag_ui/views/__init__.py with public API exports

The TYPE_CHECKING imports from claude_plugin.session.session_parser remain
because these views still work with Claude-specific session types.

All ag_ui tests pass (21/21).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Cherry-picked secret_scanner.py from feature/claude-plugin-cli and moved
it to weave/integrations/ag_ui/ to make it available as a shared
component for all agent integrations.

Changes:
- Moved weave/integrations/claude_plugin/secret_scanner.py to ag_ui/
- Moved tests/integrations/claude_plugin/test_secret_scanner.py to ag_ui/
- Updated test imports to use weave.integrations.ag_ui.secret_scanner
- Added SecretScanner to ag_ui/__init__.py exports

All tests pass (19/19).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Brings AgentTraceBuilder to feature parity with SessionProcessor:

**Key Enhancements:**

1. **ToolCallArgsEvent handling**: Store and use tool arguments from
   ToolCallArgsEvent when creating tool calls

2. **UsageRecordedEvent handling**: Aggregate token usage by model and
   attach to step calls with proper summary

3. **Tool call logging**: Use log_tool_call from claude_plugin/utils for:
   - Proper display names
   - Input/output truncation
   - Duration tracking
   - Error handling
   - Diff view generation (via hooks)

4. **Text & thinking content**: Accumulate streaming text and thinking
   content from TextMessageContentEvent and ThinkingContentEvent

5. **process_events() convenience method**: Process iterators of events

6. **Session/step output population**: Include aggregated content,
   reasoning, and usage in outputs

**Implementation Details:**

- Track message-to-step mappings for proper aggregation
- Use message_id as primary key (not step_id) per AG-UI protocol
- Support streaming via delta field in TextMessageContentEvent
- Clean up state after step finishes
- Preserve hook system for extensibility (diff views, etc.)

**Testing:**

- Added comprehensive tests for all new functionality
- Test ToolCallArgsEvent storage and usage
- Test usage aggregation across multiple messages
- Test text content accumulation (streaming)
- Test thinking content as reasoning_content
- Test process_events convenience method
- All 9 tests passing

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Task 2: Add ChatView-compatible output format to AgentTraceBuilder

Changes:
1. Added ChatView-compatible message format to step output:
   - role: "assistant"
   - model: First model from usage events
   - content: Aggregated text response
   - reasoning_content: Thinking content (for collapsible UI)
   - tool_calls: Tool calls in OpenAI format with embedded results

2. Removed circular dependency on log_tool_call:
   - Inlined tool call logging logic directly in AgentTraceBuilder
   - No longer imports from claude_plugin.utils
   - ag_ui module is now fully independent

3. Added comprehensive tests for ChatView format:
   - Test role and model inclusion
   - Test tool_calls in OpenAI format with embedded results
   - Test multiple tool calls in a step
   - Test reasoning_content inclusion
   - Test edge cases (tool calls without current step)

All tests pass successfully.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Enhance FileSnapshotEvent and trace builder to support file snapshot tracking:
- Add content, mimetype, and is_backup fields to FileSnapshotEvent
- Track file snapshots per step in trace builder
- Convert FileSnapshotEvent to Content objects
- Include file_snapshots in step output
- Add comprehensive tests for file snapshot functionality

This enables tracking of file state before/after edits for diff views and backup tracking.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add support for integration-specific behavior after a step finishes.
The hook allows integrations like claude_plugin to attach custom views
(e.g., diff views) to finished step calls.

Changes:
- Added on_step_finished parameter to AgentTraceBuilder.__init__
- Invoked hook after finishing step call in _handle_step_finished
- Added 5 comprehensive tests for hook functionality

The hook signature: Callable[[StepFinishedEvent, Any], None]
- Called with the event and the finished call object
- Called after client.finish_call completes
- Optional parameter (defaults to None)

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Implements Q&A flow context tracking to show conversation continuity when
assistant asks questions and user responds.

Changes:
- Add _pending_question state tracking to AgentTraceBuilder
- In _handle_step_started: add Q&A context (messages + in_response_to) to step
  inputs when a pending question exists from previous step
- In _handle_step_finished: detect questions from assistant text using
  _detect_question helper, store for next step's context
- Add _detect_question method for simple heuristic question detection
- Clear pending question when step doesn't end with a question

Tests:
- test_pending_question_detected_from_text: verify detection from text ending with '?'
- test_explicit_pending_question_from_event: verify explicit pending_question from event
- test_qa_context_added_to_next_step: verify Q&A context in next step inputs
- test_pending_question_cleared_when_no_question: verify clearing after non-question
- test_detect_question_method: test question detection heuristics
- test_qa_flow_with_multiple_steps: integration test for complete Q&A flow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…info

Implement Task 6: Session-level summary and git info tracking

Changes:
- Add state tracking for run metadata, step counts, and tool counts
- Store metadata from RunStartedEvent for later summary generation
- Track step-to-run mapping to associate tool calls with runs
- Increment step count in _handle_step_started
- Increment tool count in _handle_tool_result
- Build comprehensive summary in _handle_run_finished with:
  - turn_count (from step count)
  - tool_call_count
  - model (from event metadata)
  - git_info (if cwd in metadata)
- Implement _get_git_info() to extract repo, branch, commit_hash
- Clean up run metadata after run finishes
- Add comprehensive test coverage for all summary features

Tests:
- test_run_finished_sets_summary_with_turn_count
- test_run_finished_sets_summary_with_tool_count
- test_run_finished_sets_summary_with_model
- test_run_finished_sets_summary_with_git_info
- test_run_finished_comprehensive_summary
- test_run_metadata_cleanup

All 77 ag_ui tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fixed docstring formatting (D212)
- Fixed import ordering (I001)
- Sorted __all__ lists (RUF022)
- Added ClassVar annotations for mutable class attrs (RUF012)
- Converted Union[...] to | syntax (UP007)
- Fixed try/except flow (TRY300)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Removed apply_edit_operation and compute_unified_diff which don't
exist in diff_utils.py - they were leftover from planned features.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Plans should be stored in ~/.claude/plans/ instead of being
committed to the repository.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@github-actions
Copy link
Contributor

❌ Documentation Reference Check Failed

No documentation reference found in the PR description. Please add either:

This check is required for all PRs except those that start with "chore(weave)" or explicitly state "docs are not required". Please update your PR description and this check will run again automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants