Skip to content

Conversation

@vanpelt
Copy link
Collaborator

@vanpelt vanpelt commented Dec 21, 2025

Summary

This PR implements the core Claude Code plugin for Weave, building on the shared AG-UI protocol module.

Key Features

  • ClaudeParser: Converts Claude Code session data to AG-UI events
  • Daemon: Real-time tracing of Claude Code sessions with tool call streaming
  • Session Importer: Import historical sessions with full trace reconstruction
  • Comprehensive test coverage: 324+ tests covering daemon, session parsing, utilities

Recent Fixes

  • Fixed timestamp propagation in tool call logging (was showing 0.0s latency)
  • Cleaned up mid-function imports for better code organization
  • Removed unused code (~50 lines)

Stacked PRs

  • Base: feature/ag-ui-base-v3 (PR #5870)
  • This PR: Core plugin implementation
  • Next: feature/claude-plugin-cli-v3 → CLI tools

Test plan

  • All 324 tests pass
  • Lint checks pass
  • Manual testing with live Claude Code session

🤖 Generated with Claude Code

vanpelt and others added 24 commits December 21, 2025 17:31
Cherry-picked core infrastructure files from feature/claude-plugin-cli:

Core infrastructure:
- __init__.py, constants.py, utils.py, credentials.py

Core runtime:
- core/__init__.py, core/daemon.py, core/hook.py
- core/socket_client.py, core/state.py

Session processing:
- session/__init__.py, session/session_parser.py
- session/session_processor.py, session/session_importer.py
- session/session_title.py

Config and views:
- config.py
- views/__init__.py, views/cli_output.py, views/feedback.py

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Update all imports in claude_plugin to use the shared ag_ui components:
- diff_view imports: claude_plugin.views.diff_view -> ag_ui.views.diff_view
- diff_utils imports: claude_plugin.views.diff_utils -> ag_ui.views.diff_utils
- secret_scanner imports: claude_plugin.secret_scanner -> ag_ui.secret_scanner

Modified files:
- weave/integrations/claude_plugin/__init__.py
- weave/integrations/claude_plugin/core/daemon.py
- weave/integrations/claude_plugin/session/session_importer.py
- weave/integrations/claude_plugin/session/session_processor.py
- weave/integrations/claude_plugin/utils.py
- weave/integrations/claude_plugin/views/__init__.py

This completes Task 3.3 from the ag_ui implementation plan.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Cherry-picked from feature/claude-plugin-cli and updated for ag_ui structure.

Changes:
- Added 14 test files covering daemon, session processing, config, views
- Removed test_secret_scanner.py (moved to ag_ui tests in Layer 1)
- Removed test_teleport.py (belongs in Layer 3 CLI)
- Updated test_diff_view.py imports to use ag_ui.views
- Fixed ag_ui/views/__init__.py to export actual functions from diff_utils

All 268 tests passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Implements the AgentEventParser protocol for Claude Code JSONL sessions,
converting them into standardized AG-UI events for tracing.

Features:
- Implements AgentEventParser protocol with agent_name, parse(), and parse_stream()
- Converts Session → AG-UI events in proper order:
  - RunStartedEvent with first user prompt
  - StepStartedEvent/StepFinishedEvent for each turn
  - TextMessage events for user/assistant messages
  - ToolCall events for tool executions
  - UsageRecordedEvent for token tracking
  - ThinkingContentEvent for extended thinking
  - RunFinishedEvent with session metadata
- Optional secret redaction using SecretScanner
- Handles empty sessions gracefully
- Async parse_stream() for real-time parsing
- Comprehensive test coverage (10 tests, all passing)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Enhances ClaudeParser to emit richer metadata and events:

1. StepStartedEvent now includes:
   - user_message: The user's prompt (redacted if secrets detected)
   - turn_number: The turn index (1-based)

2. StepFinishedEvent now detects pending questions:
   - Extracts assistant text ending with '?'
   - Applies secret redaction to pending questions

3. FileSnapshotEvents for file backups:
   - Emits FileSnapshotEvent for each file backup in a turn
   - Includes original file path, content, and backup metadata
   - Applies secret redaction to file contents

4. Added comprehensive tests:
   - Test for user_message in StepStartedEvent metadata
   - Test for pending_question detection (with and without questions)
   - Updated imports to include FileSnapshotEvent

All tests passing (12/12).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
…create_subagent_call

- Move build_subagent_inputs and build_subagent_output helper functions
  from SessionProcessor to session_importer as module-level functions
- Add json import and truncate import needed by the moved functions
- Update _create_subagent_call to use local helper functions instead
  of SessionProcessor.build_subagent_inputs/build_subagent_output
- Replace SessionProcessor-based _import_session_to_weave with new
  event-based approach using ClaudeParser + AgentTraceBuilder
- Update tests to work with new event-based trace building

This removes the circular dependency between session_importer and
session_processor, making session_importer fully independent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Move _build_turn_output helper to session_importer.py
- Import helper functions from session_importer instead of SessionProcessor
- Inline create_session_call, create_turn_call logic into daemon.py
- Update tests to verify weave_client.create_call() instead of processor methods

This removes the SessionProcessor class dependency from daemon.py,
preparing for SessionProcessor deletion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Move get_hostname() to utils.py
- Update daemon.py import to use utils.get_hostname
- Remove SessionProcessor from session/__init__.py exports
- Rename test_session_processor.py to test_trace_helpers.py
- Update tests to use session_importer._build_* functions
- Delete session_processor.py (822 lines removed)

SessionProcessor is no longer needed since:
- Static methods moved to session_importer.py (Tasks 9, 10)
- Instance methods inlined into daemon.py (Task 10)
- Batch import uses ClaudeParser + AgentTraceBuilder (Task 8)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The Claude plugin hook runs `python -m weave.integrations.claude_plugin`
which requires __main__.py to exist. This was in the CLI layer but is
needed for core hook functionality.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Branch context for 3 stacked PRs (ag-ui-base, claude-plugin-core, cli)
- Phase 1: Remove ~276 lines of dead code (6 unused functions, 1 class)
- Phase 2: Add ~29 tests for P0-P3 critical paths
- Target: Improve coverage from 55% to 66%+
Removed truly dead code:
- is_command_output() - never called
- extract_command_output() - never called
- _COMMAND_OUTPUT_PATTERN regex - only used by removed functions

Note: Other functions identified in plan (extract_slash_command,
extract_xml_tag_content, sanitize_tool_input, get_tool_display_name)
are actually used and were preserved.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Removed dead dataclass that was never instantiated or used.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix state.py __exit__ parameters (prefix with _)
- Fix parser.py from_line parameter (prefix with _ and add TODO comment)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add comprehensive tests for WeaveDaemon session continuation detection
covering both _check_session_continuation and _create_continuation_session_call
methods (daemon.py lines 675-827).

Tests cover:
- State-based continuation detection (session_ended flag)
- API-based fallback detection (ended_at field)
- Continuation call creation with proper naming ("Continued: " prefix)
- Counter incrementation across multiple continuations
- Edge cases (no session data, API errors, no weave client)

All 9 tests pass. Implements Task 5 from coverage cleanup plan.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Implement comprehensive tests for WeaveDaemon._handle_session_end method
covering:
- Compaction count tracking in session summary
- File snapshots capture from changed files
- Git metadata (branch/commit) capture for teleport feature
- Redacted secrets count tracking
- Session lifecycle management (running state, turn finishing)
- Session file processing

All tests use proper mocking to isolate the method under test and verify
the correct data is passed to weave client finish_call.

Part of Task 6 from coverage cleanup plan.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add comprehensive tests for tool call logging with parallel grouping:
- test_parallel_tool_calls_grouped: Verifies parallel wrapper created for concurrent tools
- test_single_tool_call_not_grouped: Verifies single tools logged directly
- test_edit_tool_with_diff_view: Verifies Edit tool logging
- test_parallel_grouping_threshold: Verifies 1000ms grouping threshold

Tests validate daemon._log_pending_tool_calls_grouped() behavior:
- Tools with same timestamp are grouped under claude_code.parallel wrapper
- Single tools are logged directly without wrapper
- Tools >1000ms apart create separate groups

Implements Task 7 from coverage cleanup plan.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add comprehensive tests for skill and Q&A handling in daemon:

Skill Handling (TestSkillHandling):
- test_skill_expansion_detected: Verifies Skill tool expansion creates skill call with correct tool_name and inputs
- test_slash_command_vs_skill: Verifies SlashCommand is distinguished from Skill (different tool_name and input format)

Q&A Tracking (TestQATracking):
- test_question_extraction: Verifies single question answer is parsed correctly from tool_result
- test_multiple_questions_parsed: Verifies multiple Q&A pairs are parsed and matched correctly
- test_no_question_when_not_pending: Verifies no action when tool_use_id not in pending questions

Tests validate existing daemon.py implementation (lines 2115-2299):
- _handle_skill_expansion: Logs Skill/SlashCommand calls with expansion content
- _finish_question_call: Parses answers from tool_result and finishes question call

All tests pass, confirming correct functionality.

Implements Task 8 from docs/plans/2025-12-20-claude-plugin-coverage-cleanup.md

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Cherry-picked from feature/weave-retroactive-logging branch.

Adds started_at and ended_at parameters to:
- weave.log_call() for retroactive call logging
- client.create_call() for custom start timestamps
- client.finish_call() for custom end timestamps

This enables the claude_plugin daemon to log subagent calls and
tool calls with their original timestamps during session import
and live streaming, which is critical for accurate trace timelines.

Fixes TypeError: log_call() got an unexpected keyword argument 'started_at'

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add TestSubagentImport class with 5 comprehensive tests
- Test _create_subagent_call creates proper call structure
- Test subagent display name uses first user prompt
- Test missing agent files are handled gracefully
- Test tool calls are embedded in subagent output
- Test subagent_type parameter is passed to inputs
- All tests verify the function works with agent-{id}.jsonl files

These tests cover the critical functionality of importing subagent
sessions spawned by the Task tool, ensuring proper call creation,
display naming, tool call embedding, and error handling.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add comprehensive tests for file snapshot collection during session import,
covering both file backups from file-history and Write tool calls.

Tests added:
- File backup loading with correct mimetype detection
- Multiple file extensions with appropriate mimetypes
- Missing file handling
- Metadata preservation in Content objects
- Write tool file snapshot collection
- Deduplication of file snapshots by path

Bug fixes:
- Fix session_parser to collect tool_result messages in raw_messages
  (added else clause to handle non-user/non-assistant message types)
- Fix Content object creation in session_importer to use Content.from_bytes()
  instead of direct constructor call (which is not supported)

These changes ensure file snapshots are properly collected and attached to
session traces during import, enabling file viewing in the Weave UI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add TestConfigErrors class with 4 new tests
- Test JSON decode error handling for local settings
- Test file write error handling for local settings
- Test JSON decode error handling for global config
- Test file write error handling for global config
- All tests pass, validating existing error handling

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add comprehensive tests for truncate() and generate_session_name() error handling:
- Test truncate() with None input, empty string, and edge cases
- Test generate_session_name() fallback behavior when Claude API and Ollama fail
- Test all code paths: Claude success, Ollama success, and final fallback

All 53 tests in test_utils.py now pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fixed 7 locations in daemon.py where log_tool_call wasn't passing
  started_at/ended_at timestamps, causing 0.0s latency in Weave UI
- Added 4 tests for timestamp propagation in test_utils.py
- Fixed import ordering and removed unused imports
- Moved mid-function imports to top-level in session_importer.py

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Plans should be stored in ~/.claude/plans/ instead of being
committed to the repository.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@vanpelt vanpelt force-pushed the feature/claude-plugin-core-v3 branch from d7a47a5 to 4efd880 Compare December 22, 2025 01:31
@github-actions
Copy link
Contributor

❌ Documentation Reference Check Failed

No documentation reference found in the PR description. Please add either:

This check is required for all PRs except those that start with "chore(weave)" or explicitly state "docs are not required". Please update your PR description and this check will run again automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants