Conversation

@gschier gschier commented Dec 22, 2025

[Screenshot: CleanShot 2025-12-22 at 08 11 58@2x]

- Create separate blobs.sqlite database for large body storage
- Add body_chunks table with body_id, chunk_index, and data columns
- Implement BlobManager and BlobContext with core CRUD methods
- Add BlobManagerExt trait for app_handle.blobs() access
- Support streaming writes with configurable chunk sizes
- Add TeeReader to fork body streams while sending
- Wrap request body with TeeReader in execute_transaction
- Buffer and write 1MB chunks to blob DB with body_id format {response_id}.request
- Support both Bytes and Stream body types

gschier commented Dec 22, 2025

@claude review this please

gschier commented Dec 28, 2025

@claude review

gschier commented Dec 28, 2025

@claude review

claude bot commented Dec 28, 2025

Claude finished @gschier's task in 1m 46s


Code Review Complete

I've reviewed this PR which adds the ability to store and display request bodies in the UI. Overall, this is a solid implementation with good architecture. Here are my findings:


🎯 Architecture & Design

Strengths:

  • Clean separation of concerns: keeping the new blob storage database (blobs.sqlite) separate from the main database is a good design choice
  • TeeReader pattern: an elegant solution for capturing the request body while streaming (src-tauri/yaak-http/src/tee_reader.rs:7-43); a minimal sketch follows this list
  • Chunked storage: 1MB chunks for large request bodies prevents memory issues (src-tauri/src/http_request.rs:40)
  • Comprehensive testing: Excellent test coverage in tee_reader.rs and blob_manager.rs
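
For readers unfamiliar with the pattern, here is a minimal sketch of a tee reader (not the PR's exact code): an AsyncRead wrapper that forwards a copy of every chunk it reads into a channel. An unbounded channel keeps the sketch short; the PR's version uses a bounded channel with backpressure, discussed in later reviews.

use std::pin::Pin;
use std::task::{Context, Poll};

use tokio::io::{AsyncRead, ReadBuf};
use tokio::sync::mpsc;
use tokio_util::bytes::Bytes;

// Forwards a copy of every chunk read from `inner` into `tx`.
pub struct TeeReader<R> {
    inner: R,
    tx: mpsc::UnboundedSender<Bytes>,
}

impl<R: AsyncRead + Unpin> AsyncRead for TeeReader<R> {
    fn poll_read(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut ReadBuf<'_>,
    ) -> Poll<std::io::Result<()>> {
        let this = self.get_mut();
        let before = buf.filled().len();
        match Pin::new(&mut this.inner).poll_read(cx, buf) {
            Poll::Ready(Ok(())) => {
                let new_bytes = &buf.filled()[before..];
                if !new_bytes.is_empty() {
                    // A dropped receiver must not fail the upload; ignore send errors.
                    let _ = this.tx.send(Bytes::copy_from_slice(new_bytes));
                }
                Poll::Ready(Ok(()))
            }
            other => other,
        }
    }
}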

🐛 Potential Issues

1. Race Condition Risk (Medium Priority)

Location: src-tauri/src/http_request.rs:541-604

The write_stream_chunks_to_db function updates the response's request_content_length after streaming completes, but there's a potential race where the UI might query the body before this final update occurs. The function also doesn't handle the case where the response might be deleted while streaming.

// At line 594-601, final update might race with UI queries
app_handle.with_tx(|tx| {
    debug!("Updating final body length {total_bytes}");
    if let Ok(mut response) = tx.get_http_response(&response_id) {
        response.request_content_length = Some(total_bytes as i32);
        tx.update_http_response_if_id(&response, update_source)?;
    }
    Ok(())
})?;

Recommendation: Consider using optimistic locking or ensuring the UI properly handles the case where request_content_length might be None during streaming.


2. Database Lock Contention (Low-Medium Priority)

Location: src-tauri/yaak-models/src/blob_manager.rs:58-63

The connect() method uses .expect() when acquiring the mutex lock, which will panic if the lock is poisoned. During high-throughput scenarios (large uploads/downloads), this could cause issues.

pub fn connect(&self) -> BlobContext {
    let conn = self
        .pool
        .lock()
        .expect("Failed to gain lock on blob DB")  // ⚠️ Panics on poisoned lock
        .get()
        .expect("Failed to get blob DB connection from pool");
    BlobContext { conn }
}

Recommendation: Use unwrap_or_else or proper error handling instead of expect() for production code.
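
A minimal sketch of a non-panicking variant (the pool type and Result alias here are assumptions, not the crate's actual signatures):

pub fn connect(&self) -> Result<BlobContext> {
    // A poisoned mutex only means another thread panicked while holding it;
    // the pool behind it is still usable, so recover instead of panicking.
    let pool = self.pool.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    let conn = pool.get()?; // propagate pool exhaustion as an error
    Ok(BlobContext { conn })
}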


3. Missing Cleanup on Error Paths (Low Priority)

Location: src-tauri/src/http_request.rs:358-405

If the request fails after writing some chunks to the blob DB (e.g., network error), the partial chunks aren't cleaned up. This could lead to orphaned blob data over time.

Recommendation: Add cleanup logic in error handlers to remove partial chunk data when requests fail.


4. Integer Overflow Concern (Low Priority)

Location: src-tauri/src/http_request.rs:535

Content length is stored as i32, which limits request bodies to ~2GB. While this is probably fine for most use cases, it could be an issue for large file uploads.

r.request_content_length = Some(data.len() as i32);

Recommendation: Consider using i64 for content length fields to support larger bodies, or add validation to reject bodies > 2GB with a clear error message.


5. Potential Memory Issue in Frontend (Low Priority)

Location: src-web/hooks/useHttpRequestBody.ts:29-30

The entire request body is loaded into memory as a Uint8Array on the frontend, which could cause issues with very large request bodies (hundreds of MBs).

const body = new Uint8Array(data);
const bodyText = new TextDecoder('utf-8', { fatal: false }).decode(body);

Recommendation: The ConfirmLargeResponseRequest component (2MB threshold) helps, but consider streaming or pagination for truly large bodies.


💡 Code Quality Observations

Good Practices:

  • ✅ Comprehensive error handling in most paths
  • ✅ Well-structured tests with edge cases covered
  • ✅ Proper use of async/await and tokio primitives
  • ✅ Database migrations are clean and include proper indexes
  • ✅ UI components properly use React Query for caching

Minor Issues:

  1. Unused Error Return (src-tauri/src/lib.rs:808): The error handling was changed from returning an error to returning an empty string, but there's no comment explaining why this change was made. Consider adding a comment for future maintainers.

  2. Magic Number (src-web/components/ConfirmLargeResponseRequest.tsx:19): The 2MB threshold for large response warnings could be extracted to a constant.

  3. Missing Index on created_at (blob_migrations/00000000000000_init.sql:12): If you ever need to clean up old blobs, a created_at index would be helpful, though the current query patterns don't require it.


🔒 Security Considerations

  • ✅ No SQL injection risks (properly using parameterized queries)
  • ✅ No XSS risks in UI components
  • ⚠️ Disk Space Exhaustion: No apparent limits on total blob storage size. Consider implementing periodic cleanup of old response bodies or a maximum storage limit.
  • ✅ Proper cleanup on response deletion (http_responses.rs:72-76)

🧪 Testing

Well Tested:

  • TeeReader has excellent test coverage including edge cases (empty data, receiver dropped, large data)
  • BlobManager has comprehensive tests for all CRUD operations
  • Good coverage of chunked reads and ordering

Could Be Improved:

  • Consider adding integration tests for the full request body capture flow
  • Test error scenarios (disk full, database locked, etc.)

📊 Performance Notes

  • Chunking strategy (1MB) is reasonable
  • Blob database pool size (50 connections) seems appropriate
  • Using unbounded channels for chunk streaming could theoretically cause memory issues if the producer is much faster than the consumer, but in practice the HTTP client applies backpressure

Summary

This is a well-implemented feature with good architecture and testing. The main concerns are around edge cases (race conditions, error cleanup, large body handling) rather than fundamental design issues.

Recommended actions before merge:

  1. Consider addressing the race condition in write_stream_chunks_to_db
  2. Replace .expect() calls with proper error handling in production paths
  3. Add cleanup for orphaned blobs on error paths
  4. Consider adding a comment explaining the empty string return in cmd_http_response_body

The current implementation is functional and safe for most use cases, but the above improvements would make it more robust for production use.


claude bot commented Dec 28, 2025

Code Review: PR #327 - Store and Show Request Body in UI

Overview

This PR adds functionality to capture, store, and display HTTP request bodies. The implementation introduces a blob storage system, TeeReader for streaming capture, and UI components. Overall, the architecture is solid with good separation of concerns, but there are several critical issues that need addressing.


Critical Issues (Must Fix Before Merge)

1. Unbounded Channel Growth - Memory Exhaustion Risk

Location: src-tauri/src/http_request.rs:368, 338

Issue: Unbounded channels can grow indefinitely if the database writer is slower than the HTTP client, leading to memory exhaustion on large request bodies.

Fix: Use bounded channels with backpressure instead of unbounded_channel()
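
A sketch of the suggested change (the capacity of 10 is illustrative):

use tokio::sync::mpsc;
use tokio_util::bytes::Bytes;

// With ~1MB chunks, a capacity of 10 caps in-flight memory at ~10MB.
// When the DB writer falls behind, send().await parks the producer,
// backpressuring the upload instead of growing an unbounded queue.
let (tx, mut rx) = mpsc::channel::<Bytes>(10);

// Producer side:
// tx.send(chunk).await?; // waits for a free slot instead of buffering forever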

2. Excessive Database Updates - Performance Issue

Location: src-tauri/src/http_request.rs:492-495

Issue: Database updates occur for every 8KB chunk read. For a 100MB response, this is ~12,800 DB updates, causing severe performance degradation.

Fix: Throttle updates to every 1MB or similar interval
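
The throttle can be byte-based or time-based; a time-based sketch, where update_content_length is a hypothetical helper standing in for the response update:

use std::time::{Duration, Instant};

const UPDATE_INTERVAL: Duration = Duration::from_millis(100);

let mut last_update = Instant::now();
let mut written_bytes: usize = 0;

while let Some(chunk) = rx.recv().await {
    written_bytes += chunk.len();
    if last_update.elapsed() >= UPDATE_INTERVAL {
        update_content_length(written_bytes)?; // throttled progress update
        last_update = Instant::now();
    }
}
update_content_length(written_bytes)?; // final, exact count after the loop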

3. Integer Overflow on Large Bodies

Location: src-tauri/src/http_request.rs:535, 597

Issue: i32 max is ~2.1GB. Bodies larger than this will silently overflow.

Fix: Use i64 and update the database schema.

4. No Size Limits - DoS/Storage Exhaustion Risk

Issue: No validation prevents storing arbitrarily large request bodies, leading to potential disk exhaustion.

Fix: Add configurable max size (e.g., 100MB default)

5. Silent Error Handling in Blob Writing

Location: src-tauri/src/http_request.rs:390

Issue: Errors are only logged. Users see requests as successful even when body storage fails, leading to data loss.

Fix: Store error state in response or surface to UI


High Priority Issues

6. Task Leak on Early Errors

Location: src-tauri/src/http_request.rs:379-392

Issue: If execute_transaction fails early, the spawned write_stream_chunks_to_db task continues waiting indefinitely.

Fix: Add cancellation mechanism using oneshot channels or CancellationToken.
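
A sketch using tokio_util's CancellationToken (write_chunk is a hypothetical stand-in for the per-chunk DB insert):

use tokio::sync::mpsc;
use tokio_util::bytes::Bytes;
use tokio_util::sync::CancellationToken;

async fn write_chunks(
    mut rx: mpsc::Receiver<Bytes>,
    cancel: CancellationToken,
) -> Result<(), anyhow::Error> {
    loop {
        tokio::select! {
            // Resolve promptly if execute_transaction aborts early.
            _ = cancel.cancelled() => break,
            maybe_chunk = rx.recv() => match maybe_chunk {
                Some(chunk) => write_chunk(chunk).await?, // hypothetical DB insert
                None => break, // body stream finished normally
            },
        }
    }
    Ok(())
}

// The caller keeps the token and calls cancel.cancel() on early failure.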

7. Orphaned Ephemeral Response Files

Location: src-tauri/src/http_request.rs:419-423

Issue: Ephemeral responses generate UUID-based file paths that are never cleaned up.

Fix: Track ephemeral files and clean up on shutdown or periodically.

8. Orphaned Chunks on Partial Failures

Issue: If chunk writing fails mid-stream, partial data remains in the database.

Fix: Add cleanup on error
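
Sketch of the cleanup (delete_chunks is a hypothetical BlobManager method keyed by body_id):

if let Err(e) = write_stream_chunks_to_db(/* ... */).await {
    // Remove whatever partial chunks already landed so no orphans accumulate.
    if let Err(cleanup_err) = app_handle.blobs().delete_chunks(&body_id) {
        warn!("Failed to clean up partial blob {body_id}: {cleanup_err}");
    }
    return Err(e);
}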

9. Memory Inefficiency - Double Buffering

Location: src-tauri/src/http_request.rs:362

Issue: Full copy of request body created in memory. 100MB request = 200MB memory usage.

Fix: Use Bytes::slice() or consume and reconstruct.

10. Race Condition in Stream Completion Update

Location: src-tauri/src/http_request.rs:594-601

Issue: Silent failure if response was deleted. Should log or error.


Positive Aspects

  1. Excellent architecture - ResponseContext abstraction cleanly separates persisted vs ephemeral responses
  2. Comprehensive tests - TeeReader has 5 test cases covering edge cases
  3. Proper cleanup - delete_http_response includes blob deletion
  4. Good async patterns - Proper use of channels and spawned tasks
  5. Blob manager well-designed - Good indexing, UNIQUE constraints, timestamps

Suggestions for Future Improvements

  1. Batch DB inserts - Insert chunks in transactions for better performance
  2. Add ResponseContext tests - Critical component has no tests
  3. UI confirmation for large bodies - Warn before loading 100MB into browser
  4. Shutdown task tracking - Ensure all spawned tasks complete on shutdown
  5. Better error propagation - Don't silently ignore update errors (line 141, 148)

Test Coverage Needed

  • ResponseContext (persisted vs ephemeral behavior)
  • Stream chunk writing failures and cleanup
  • Very large bodies (>100MB)
  • Concurrent operations
  • Cancellation during body write
  • Edge cases: empty bodies, exact chunk boundaries

Summary

This is a well-architected feature with solid fundamentals, but the critical issues around resource management, performance, and error handling must be addressed before merge. The unbounded channels and excessive DB updates are particularly concerning for production use.

Recommendation: Address the 5 critical issues, then consider the high-priority improvements in a follow-up PR.

Great work on the overall design! The TeeReader implementation is particularly clean.

claude bot commented Dec 28, 2025

Pull Request Review: Store and Show Request Body in UI

Summary

This PR implements request body storage and visualization in the UI. The implementation uses the blob storage system to capture request bodies as they're sent, stores them in chunks, and displays them in a new UI component with performance safeguards for large bodies.

Code Quality & Architecture ✅

Strengths:

  • Excellent separation of concerns: ResponseContext pattern cleanly abstracts persisted vs ephemeral response handling (http_request.rs:42-86)
  • Well-tested components: TeeReader has comprehensive test coverage including edge cases (empty data, dropped receivers, large data, backpressure)
  • Consistent with existing patterns: Uses the established blob storage system and follows existing query patterns
  • Good documentation: Clear comments explaining backpressure mechanisms and design choices

Areas for improvement:

1. Potential Backpressure Busy-Wait in TeeReader (Critical)

Location: src-tauri/yaak-http/src/tee_reader.rs:38-44

The backpressure implementation in TeeReader has a potential issue:

Err(mpsc::error::TrySendError::Full(_)) => {
    cx.waker().wake_by_ref();
    return Poll::Pending;
}

When the channel is full, this wakes the waker immediately and returns Pending, which can cause a busy-wait loop that burns CPU: the task is polled again right away, likely before the channel has been drained.

Recommendation: Consider using send().await instead of try_send() to properly block until space is available, or implement a more sophisticated backpressure mechanism with separate task coordination.
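
One poll-friendly option is tokio_util's PollSender, whose poll_reserve registers the waker with the channel so the task is only woken once capacity actually frees up, rather than spinning:

use tokio_util::sync::PollSender;

// self.tx: PollSender<Bytes> wrapping a bounded mpsc::Sender.
match self.tx.poll_reserve(cx) {
    Poll::Ready(Ok(())) => {
        // Capacity reserved: read from the inner stream, then self.tx.send_item(bytes).
    }
    Poll::Ready(Err(_)) => {
        // Receiver dropped: stop teeing but keep serving reads.
    }
    Poll::Pending => return Poll::Pending, // waker registered by poll_reserve
}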

2. Error Handling in Stream Chunk Writing

Location: src-tauri/src/http_request.rs:342-369

tauri::async_runtime::spawn(async move {
    if let Err(e) = write_stream_chunks_to_db(...).await {
        error!("Error writing stream chunks to DB: {}", e);
    };
});

Errors in the background task are only logged. If blob insertion fails, the request_content_length will never be updated, leaving incomplete data.

Recommendation: Consider propagating errors back to the main response or setting an error flag on the response to indicate partial data capture.

3. Unbounded Memory in Buffer Accumulation

Location: src-tauri/src/http_request.rs:544-577

let mut buffer = Vec::with_capacity(REQUEST_BODY_CHUNK_SIZE);
// ...
buffer.extend_from_slice(&data);

While the buffer has an initial capacity, if many small chunks arrive rapidly before reaching REQUEST_BODY_CHUNK_SIZE, the buffer can grow unbounded.

Recommendation: Add a maximum buffer size check or flush timeout to prevent memory issues with pathological input patterns.

4. Race Condition in Final Size Update

Location: src-tauri/src/http_request.rs:605-613

The final size update happens in a background task after streaming completes:

if let Ok(mut response) = tx.get_http_response(&response_id) {
    response.request_content_length = Some(total_bytes as i32);
    tx.update_http_response_if_id(&response, update_source)?;
}

There's a potential race where the UI might read the response before this update completes, showing incomplete size information.

Recommendation: Consider using atomic updates or ensuring this completes before marking the response as closed.

Performance Considerations ⚠️

Positives:

  • Throttled DB updates (100ms interval) prevent excessive writes during streaming (http_request.rs:500-505)
  • Bounded channels provide backpressure (buffer size 10 chunks ~10MB for body, 100 events for events)
  • Large body warning in UI prevents performance issues (ConfirmLargeResponseRequest.tsx:19-29)
  • Chunked storage avoids loading entire bodies into memory

Concerns:

  1. Synchronous blob writes for byte bodies (http_request.rs:371):

    write_bytes_to_db_sync(response_ctx, &body_id, bytes.clone())?;

    This clones the entire body and writes synchronously, which could block the HTTP transaction for large bodies. Consider async write or background task.

  2. Integer overflow potential: Request sizes are stored as i32 (max ~2GB). Large file uploads could overflow:

    r.request_content_length = Some(total_bytes as i32);  // Unchecked cast

    Recommendation: Use i64 or add overflow checks.

  3. No cleanup for failed requests: If a request fails mid-stream, orphaned chunks may remain in the blob database.

Security Concerns 🔒

Low Risk Issues:

  1. SQL Injection Protection: ✅ Uses parameterized queries throughout
  2. Path Traversal: ✅ Body IDs are generated with controlled format {response_id}.request
  3. DoS via large bodies: ⚠️ Integer overflow could cause issues (see above)

Recommendation:

Add a maximum body size limit configuration to prevent malicious/accidental resource exhaustion.
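
Sketch of such a guard in the chunk-writing loop (the 100MB default and the truncate-rather-than-fail behavior are assumptions):

const MAX_STORED_BODY_BYTES: usize = 100 * 1024 * 1024; // illustrative default

if total_bytes + chunk.len() > MAX_STORED_BODY_BYTES {
    // Keep the request itself streaming, but stop persisting and mark the
    // stored copy as truncated rather than filling the disk.
    warn!("Request body exceeds {MAX_STORED_BODY_BYTES} bytes; truncating stored copy");
    break;
}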

Test Coverage 📊

Excellent:

  • TeeReader has comprehensive unit tests (5 test cases covering edge cases)
  • BlobManager has thorough test coverage (11 test cases including large data, ordering, deletion)

Missing:

  • Integration tests for the full request body capture flow
  • Tests for error scenarios (DB write failures, channel closures during streaming)
  • Tests for ephemeral vs persisted response handling differences
  • Tests for the backpressure behavior under load

Recommendation: Add integration tests exercising the full flow from request send → body capture → UI display.

Additional Observations

Database Schema ✅

  • Clean migration adding request_content_length column
  • Blob storage schema is well-indexed with composite index on (body_id, chunk_index)
  • Unique constraint prevents duplicate chunks

Frontend Implementation ✅

  • Proper React Query usage with appropriate cache keys
  • Loading states handled correctly
  • Good UX with large body warning and copy functionality
  • Key-based remounting ensures fresh data on response change

Minor Issues:

  1. Inconsistent error handling in cmd_http_response_body vs cmd_http_request_body:

    • cmd_http_response_body returns error for missing path (lib.rs:788)
    • cmd_http_request_body returns Ok(None) for empty chunks (lib.rs:817)

    Consider consistent pattern for both.

  2. Near-unused import in http_request.rs:

    use tokio_util::bytes::Bytes;  // Only used for function parameter type
  3. Magic number: LARGE_BYTES = 2 * 1000 * 1000 in ConfirmLargeResponseRequest.tsx:19
    Consider moving to a shared constants file.

Verdict

Overall Assessment: Approve with Recommendations

This is a well-architected feature that fits nicely into the existing codebase. The blob storage approach is appropriate for this use case, and the code follows established patterns.

Priority fixes before merge:

  1. 🔴 HIGH: Address integer overflow risk (use i64 for content lengths)
  2. 🟡 MEDIUM: Review TeeReader backpressure mechanism to prevent busy-wait
  3. 🟡 MEDIUM: Add error propagation for blob write failures

Post-merge improvements:

  • Add integration tests for full flow
  • Consider async blob writes for large byte bodies
  • Add cleanup for orphaned chunks
  • Add configurable max body size limit

Great work on maintaining code quality and test coverage! The feature is production-ready with the high-priority fixes.

claude bot commented Dec 28, 2025

Pull Request Review: Store and Show Request Body in UI

Summary

This PR implements a significant feature that captures and stores HTTP request bodies in a separate blob database. The implementation is well-architected with proper separation of concerns.


Code Quality & Architecture ✅

Strengths:

  • Excellent separation of concerns with BlobManager abstraction
  • Well-tested core functionality (11+ unit tests in blob_manager.rs)
  • Clean TeeReader implementation with proper AsyncRead and backpressure
  • Proper resource cleanup on response deletion
  • Good type safety with consistent i64 usage for content lengths

Suggestions:

  1. Consider extracting magic numbers to constants
  2. Add documentation explaining persisted vs ephemeral responses

Potential Bugs & Issues ⚠️

High Priority

1. TeeReader backpressure busy-wait (tee_reader.rs:38-44):
When channel is full, code calls wake_by_ref() and returns Poll::Pending immediately, creating a busy-wait loop. Consider using proper async wait or blocking send.

2. Silent error swallowing (http_request.rs:348):
Event insertion errors are ignored with let _. Could lead to missing timeline data. Recommend logging at warning level.
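
Sketch of the suggested logging (the insertion call shown is illustrative, not the actual API):

if let Err(e) = response_ctx.insert_event(event) {
    // Don't fail the request over a missing timeline entry, but leave a trace.
    warn!("Failed to record response timeline event: {e}");
}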

Medium Priority

3. Race condition in stream chunk writing (http_request.rs:355-390):
Spawned task operates independently - if request fails/cancels, final request_content_length update might not occur.

4. Unbounded memory growth (http_request.rs:394-401):
For ephemeral responses, event draining task has no backpressure mechanism.


Performance Considerations ⚡

Strengths:

  • 1MB chunked storage balances overhead vs granularity
  • 100ms DB update throttling prevents excessive writes
  • Bounded channels (100 events, 10 body chunks)
  • Lazy loading in UI via useHttpRequestBody hook
  • Large content warnings prevent rendering issues

Concerns:

1. Loading entire body into memory (lib.rs:814, useHttpRequestBody.ts:21-31):
Could cause OOM for very large requests. Recommend:

  • Size limit for in-memory loading (e.g., 10MB)
  • Stream to temp file for larger bodies
  • Or implement pagination/streaming

2. Synchronous blob writes (http_request.rs:367):
SendableBody::Bytes writes happen synchronously during request execution, potentially blocking. Consider async/background task.


Security Concerns 🔒

High Priority

1. No blob storage limits:
No enforced quotas - could enable disk space exhaustion. Recommend workspace/global quotas and garbage collection.

Medium Priority

2. Sensitive data in blobs:
Request bodies may contain credentials/PII with no encryption at rest. Consider:

  • Encryption at rest
  • User preference to disable capture
  • Security documentation

3. Access control (lib.rs:814-827):
cmd_http_request_body doesn't verify workspace access. May be OK if Tauri handles this, but worth verifying.


Test Coverage 🧪

Strengths:

  • Excellent unit tests (11 in blob_manager, 5 in tee_reader)

Gaps:

  • No integration tests for full capture flow
  • No error path testing
  • No cleanup tests
  • Missing frontend tests

Recommend adding integration tests for full flow, error handling, and large bodies.


Recommendations Summary 🎯

Must Fix (Before Merge)

  1. ⚠️ Fix TeeReader backpressure busy-wait
  2. ⚠️ Add size limits for in-memory body loading (OOM prevention)

Should Fix (Soon After)

  1. Add error logging for event processing
  2. Implement blob storage quotas/limits
  3. Add integration tests

Nice to Have (Future)

  1. Blob encryption at rest
  2. User preference to disable capture
  3. Optimize synchronous Bytes blob writes
  4. Module-level documentation

Overall Assessment

Well-architected feature with clean separation of concerns and thoughtful performance considerations. The blob storage design is sound, streaming implementation handles backpressure well, and the UI/UX is polished.

Main concerns:

  1. TeeReader backpressure needs refinement
  2. Memory safety for large bodies needs attention
  3. Production needs storage quotas

Recommendation: Approve with minor changes

Great work! The design shows careful consideration of performance, maintainability, and user experience. 🎉

claude bot commented Dec 28, 2025

PR Review: Store and show request body in UI

Overview

This PR adds functionality to capture and display HTTP request bodies in the UI, implementing a robust streaming architecture with blob storage. Overall, this is a well-structured implementation with good attention to performance and user experience.


Code Quality & Best Practices ✅

Strengths:

  • Excellent separation of concerns: The ResponseContext abstraction cleanly handles both persisted and ephemeral responses
  • Well-tested: Comprehensive unit tests for TeeReader and BlobManager covering edge cases (empty data, receiver dropped, large data, chunked reads)
  • Good error handling: Graceful degradation when blob writing fails, receiver drops, or storage errors occur
  • Performance-conscious: Throttled DB updates (100ms intervals) and bounded channels with backpressure prevent excessive writes
  • Clear documentation: Inline comments explain design decisions (e.g., backpressure mechanisms, chunk sizes)

Areas for improvement:

  1. TeeReader backpressure implementation (src-tauri/yaak-http/src/tee_reader.rs:38-44)

    • The current backpressure mechanism using try_send + wake_by_ref may cause busy-waiting
    • Consider using async send() instead to properly block without spinning:
    // Current: may cause busy-waiting
    Err(mpsc::error::TrySendError::Full(_)) => {
        cx.waker().wake_by_ref();
        return Poll::Pending;
    }
    
    // Consider: proper async blocking (though this requires refactoring)
    • If keeping try_send, document why this approach was chosen over async send
  2. Error handling could be more granular (src-tauri/src/http_request.rs:361-368)

    • When blob writing fails, the error is appended to the response error string, but this might not give users enough context
    • Consider logging the full error details at error! level for debugging
  3. Magic number could be a constant (src-tauri/src/http_request.rs:488)

    const UPDATE_INTERVAL_MS: u128 = 100; // Already defined - good!

    This is already done correctly ✅


Potential Bugs & Issues ⚠️

  1. Race condition in maybe_blob_write_handle cleanup (src-tauri/src/http_request.rs:354-368)

    • If the blob write task panics (not just errors), the handle will return Err(JoinError), but the code only handles Ok(Err(e))
    • Consider:
    if let Some(handle) = maybe_blob_write_handle {
        match handle.await {
            Ok(Err(e)) => { /* existing error handling */ }
            Err(join_err) if join_err.is_panic() => {
                // Handle panic in blob writer task
            }
            _ => {}
        }
    }
  2. Potential integer overflow (src-tauri/src/http_request.rs:572, 634)

    r.request_content_length = Some(total_bytes as i64);
    • On 64-bit systems, usize can exceed i64::MAX, causing overflow
    • Consider using try_into() with error handling or documenting the assumption that request bodies won't exceed 8 exabytes (see the sketch after this list)
  3. Missing validation in BlobContext::insert_chunk (src-tauri/yaak-models/src/blob_manager.rs:74-80)

    • No validation that chunk_index is non-negative (though i32 type allows negatives)
    • No validation that chunk data isn't empty (could bloat DB)
    • Consider adding assertions or early returns for invalid inputs
  4. Database migration is additive-only (src-tauri/yaak-models/migrations/20251221100000_request-content-length.sql)

    • The migration adds a nullable column, which is good for backward compatibility
    • No data migration strategy if users have existing responses - they'll have NULL for request_content_length
    • This appears intentional but should be documented
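
A sketch of the checked conversion mentioned in item 2 above (anyhow is assumed for the error type):

// usize::MAX exceeds i64::MAX on 64-bit targets, so make failure explicit
// instead of silently wrapping with `as i64`.
let len = i64::try_from(total_bytes)
    .map_err(|_| anyhow::anyhow!("request body too large to record: {total_bytes} bytes"))?;
r.request_content_length = Some(len);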

Performance Considerations 🚀

Good:

  • ✅ Bounded channels prevent memory exhaustion (10 chunks ~10MB for body, 100 events)
  • ✅ Throttled DB updates (100ms intervals) reduce write amplification
  • ✅ Chunked storage (1MB chunks) enables streaming large bodies without loading into memory
  • ✅ UI prompts before rendering large bodies (>2MB) to prevent performance issues
  • ✅ Query invalidation keys include requestContentLength to avoid stale data

Concerns:

  1. DB connection pooling (src-tauri/yaak-models/src/blob_manager.rs:58-63)

    • The code uses .expect() on pool lock and connection acquisition
    • Under heavy load, this could panic if pool is exhausted
    • Consider returning Result and handling pool exhaustion gracefully
  2. Memory usage during stream capture (src-tauri/src/http_request.rs:391-395)

    • Bounded channel has capacity of 10 chunks (10MB in flight max)
    • For very fast uploads + slow DB writes, this could cause request slowdown (by design)
    • Document this tradeoff or make buffer size configurable
  3. Sequential chunk writes (src-tauri/src/http_request.rs:601-624)

    • Chunks are written one at a time to the DB (good for bounded memory)
    • For slow disk I/O, this could slow down HTTP uploads
    • Current approach is reasonable, but worth monitoring
  4. Query performance on body_chunks table (src-tauri/yaak-models/blob_migrations/00000000000000_init.sql:12)

    • Index on (body_id, chunk_index) is optimal for ordered retrieval ✅
    • Consider adding index on created_at if implementing cleanup by age

Security Concerns 🔒

Good security practices:

  • ✅ No SQL injection vulnerabilities (parameterized queries)
  • ✅ Blob storage isolated from main DB
  • ✅ No sensitive data logged in error messages
  • ✅ Proper UTF-8 decoding with fatal: false prevents panics on invalid UTF-8

Minor concerns:

  1. Unbounded blob storage growth (src-tauri/yaak-models/src/blob_manager.rs)

    • No mechanism to garbage collect old request bodies
    • Over time, blob DB could grow indefinitely
    • Consider implementing:
      • Automatic cleanup of bodies for deleted responses
      • TTL-based cleanup
      • User-configurable size limits
  2. Error messages might leak paths (src-tauri/src/http_request.rs:361-365)

    • Error message includes storage error which might contain file paths
    • Generally low risk for a local desktop app, but worth noting
    • Current implementation is acceptable

Test Coverage 📊

Excellent test coverage:

  • TeeReader: 5 comprehensive tests covering normal, edge, and error cases
  • BlobManager: 9 tests covering CRUD, ordering, isolation, and large data
  • ✅ Tests use in-memory SQLite for fast, isolated testing

Missing tests:

  • ❌ Integration test for full request body capture flow (stream → TeeReader → chunks → retrieval)
  • ❌ Test for blob write task failure and error propagation to response
  • ❌ Test for concurrent chunk writes (race conditions)
  • ❌ Test for ResponseContext update logic with DB rollback scenarios

Recommendation:
Add at least one integration test in src-tauri/src/http_request.rs that:

  1. Creates a request with a stream body
  2. Sends it through execute_transaction
  3. Verifies chunks are written correctly
  4. Retrieves body via cmd_http_request_body and validates content

Additional Observations

  1. Generated files in .gitattributes

    • Correctly marks bindings and WASM pkg as generated (lines 3-4)
  2. Frontend implementation is clean

    • Proper use of React Query with placeholder data
    • Large body confirmation UX is well-designed
    • Error states are handled appropriately
  3. TypeScript types are consistent

    • Generated bindings properly reflect Rust model changes
    • requestContentLength properly typed as number | null
  4. Migration naming is clear

    • Timestamp + descriptive name follows best practices

Summary & Recommendations

Overall Assessment: Strong implementation ⭐⭐⭐⭐ (4/5)

This PR demonstrates solid engineering with good attention to performance, error handling, and UX. The architecture is well-designed and the code is clean and maintainable.

Must-fix before merge:

  • None - no blocking issues

Should consider:

  1. Add integration test for full capture flow
  2. Handle task panic in maybe_blob_write_handle cleanup
  3. Consider garbage collection strategy for blob storage
  4. Document/validate integer overflow assumptions for request_content_length

Nice to have:

  1. Make buffer sizes configurable via settings
  2. Add metrics/logging for blob storage size growth
  3. Document backpressure mechanism choice in TeeReader

Great work! 🎉

claude bot commented Dec 28, 2025

Pull Request Review: Store and Show Request Body in UI

Overview

This PR implements functionality to capture, store, and display HTTP request bodies in the UI. The implementation involves significant changes across both the Rust backend (Tauri) and TypeScript frontend, introducing a new blob storage system for chunked body data.

Code Quality & Architecture

✅ Strengths

  1. Well-structured blob storage system: The new BlobManager and BlobContext implementation in src-tauri/yaak-models/src/blob_manager.rs is clean, well-tested, and follows good Rust patterns with proper separation of concerns.

  2. Comprehensive testing: Excellent test coverage for critical components:

    • TeeReader has 5 thorough tests covering edge cases (empty data, receiver dropped, large data, chunked reads)
    • BlobManager has 11 tests covering all CRUD operations and edge cases
    • Tests demonstrate good attention to detail
  3. Backpressure handling: The TeeReader implementation properly handles backpressure when the channel is full (lines 40-44 in tee_reader.rs), preventing memory issues during large uploads.

  4. Refactored error handling: The new ResponseContext abstraction (lines 38-50 in http_request.rs) provides a cleaner API for managing both persisted and ephemeral responses.

⚠️ Areas for Improvement

1. Type inconsistency in content length fields (Minor)

Location: src-tauri/src/http_request.rs:337, src-tauri/yaak-models/src/models.rs

The PR adds several content-length fields with inconsistent types:

  • content_length: i64 (changed from i32)
  • content_length_compressed: i64
  • request_content_length: i64

However, there's still mixing of types in the conversion code:

// Line 423
r.content_length = http_response.content_length.map(|l| l as i64);

// Line 530
r.content_length = Some(written_bytes as i64);

Recommendation: Ensure consistent use of i64 throughout and consider using u64 since content lengths cannot be negative. Also verify that the TypeScript bindings properly handle 64-bit integers (JavaScript's Number type may lose precision for very large values).

2. Potential race condition in blob writing (Medium Priority)

Location: src-tauri/src/http_request.rs:354-362

The request body is captured in a background task, but if the task fails, the error is only logged after the HTTP transaction completes:

if let Some(handle) = maybe_blob_write_handle {
    if let Ok(Err(e)) = handle.await {
        // Update response with storage error
    }
}

Issues:

  • If the blob write task panics instead of returning an error, it won't be detected
  • The error is only checked after the entire response completes, potentially hiding issues

Recommendation:

  • Use .await? on the handle if blob storage is critical, or
  • Add logging/monitoring when spawning the task to detect panics
  • Consider whether partial blob storage should fail the entire request

3. DB update throttling could miss final update (Low Priority)

Location: src-tauri/src/http_request.rs:524-533

The throttling logic updates the DB at most every 100ms, but there's no guarantee the final accurate byte count is written before the function returns:

if elapsed_since_update >= UPDATE_INTERVAL_MS {
    response_ctx.update(|r| {
        r.elapsed = start.elapsed().as_millis() as i32;
        r.content_length = Some(written_bytes as i64);
    })?;
    last_update_time = now;
}

Recommendation: Add a final DB update after the loop completes to ensure the exact byte count is persisted (similar to what's done on line 542-546).

4. Missing cleanup for ephemeral request bodies

Location: src-tauri/src/http_request.rs:444-450

Ephemeral responses use the OS temp directory but there's no cleanup mechanism:

let temp_dir = std::env::temp_dir().join("yaak-ephemeral-responses");
create_dir_all(&temp_dir).await?;
temp_dir.join(uuid::Uuid::new_v4().to_string())

Recommendation: Implement cleanup for old ephemeral files, either on startup or via a background task. Consider using tempfile::NamedTempFile which auto-cleans on drop.
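
Sketch with the tempfile crate (keeping the handle alive for the response's lifetime is the caller's responsibility):

use tempfile::NamedTempFile;

// The file is deleted when `tmp` drops, so ephemeral bodies can't pile up
// across runs the way bare UUID-named files in the temp dir do.
let tmp = NamedTempFile::new()?; // created in the OS temp directory
let body_path = tmp.path().to_path_buf();
// ... stream the ephemeral response body to body_path, keeping `tmp` alive ...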

Performance Considerations

✅ Good Practices

  1. Chunked storage: 1MB chunk size (line 39 in http_request.rs) is reasonable for balancing memory usage and I/O efficiency.

  2. Bounded channels: All channels use bounded sizes (10 chunks ~10MB) to prevent unbounded memory growth.

  3. Lazy loading in UI: The useHttpRequestBody hook only fetches when requestContentLength > 0 and uses React Query for caching.

  4. Large response warning: The ConfirmLargeResponseRequest component prevents rendering bodies over 2MB until explicitly requested.

⚠️ Potential Issues

  1. Full data load in cmd_http_request_body (Medium Priority)
    Location: src-tauri/src/lib.rs:810-827
let chunks = app_handle.blobs().get_chunks(&body_id)?;
let body: Vec<u8> = chunks.into_iter().flat_map(|c| c.data).collect();
Ok(Some(body))

This loads the entire request body into memory. For large uploads (100MB+), this could cause issues.

Recommendation: Consider streaming the data or implementing pagination, similar to how response bodies use file-based storage (see the sketch after this list).

  2. Index on body_chunks.body_id: at first glance this looked missing, but the migration does create it properly: CREATE INDEX idx_body_chunks_body_id ON body_chunks (body_id, chunk_index); so ordered retrieval is already covered.
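
A hedged sketch of chunk-at-a-time retrieval for item 1 above (get_chunk is an assumed accessor keyed by (body_id, chunk_index), not the crate's actual API):

// Stream chunks to a sink instead of concatenating everything into one Vec<u8>.
let mut index = 0;
while let Some(chunk) = app_handle.blobs().get_chunk(&body_id, index)? {
    sink.write_all(&chunk.data)?; // `sink` could be a std::fs::File or IPC stream
    index += 1;
}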

Security Concerns

✅ Good Security Practices

  1. No SQL injection risks: All queries use parameterized statements (params![...]).

  2. Proper error handling: Sensitive paths and errors are properly handled without leaking information.

  3. Bounded memory usage: Channel backpressure prevents DoS via memory exhaustion.

⚠️ Considerations

  1. Disk space exhaustion (Low Priority)

The blob storage doesn't appear to have size limits. A malicious or buggy client could fill disk space by uploading large request bodies repeatedly.

Recommendation: Consider implementing:

  • Per-workspace quota limits
  • Automatic cleanup of old blobs
  • Disk space monitoring with warnings
  2. Binary data handling

Request bodies are stored as raw bytes but decoded as UTF-8 in the UI (line 30 in useHttpRequestBody.ts):

const bodyText = new TextDecoder('utf-8', { fatal: false }).decode(body);

Note: The { fatal: false } option is good - it prevents crashes on invalid UTF-8. However, binary data will display garbled. Consider detecting binary content and showing a hex viewer or "binary content" message.

Test Coverage

✅ Excellent Coverage

  • TeeReader: 5 comprehensive tests
  • BlobManager: 11 tests covering all operations
  • Both include edge cases and error scenarios

❌ Missing Tests

  1. No integration tests for the full request body capture flow
  2. No tests for cmd_http_request_body Tauri command
  3. No tests for UI components (ConfirmLargeResponseRequest, RequestBodyViewer)
  4. No tests for error scenarios in write_stream_chunks_to_db

Recommendation: Add at least one integration test that:

  • Sends a real HTTP request with a body
  • Verifies the body is stored correctly in blobs
  • Retrieves it via cmd_http_request_body
  • Validates the data matches

Additional Observations

Minor Issues

  1. Unused contentLengthCompressed field: This field is added to the model but never populated or used. Should it be implemented or removed?

  2. Magic numbers: The 2MB threshold (line 19 in ConfirmLargeResponseRequest.tsx) and 1MB chunk size could be constants with descriptive names.

  3. Logging inconsistency: Some places use debug!, others warn!. The blob write completion uses debug! (lines 560, 583) but might warrant info! for visibility.

  4. TypeScript naming: getRequestBodyText returns { body, bodyText } but the function name only mentions text. Consider renaming to getRequestBody or splitting into two functions.

Overall Assessment

Status: ✅ Approve with minor recommendations

This is a well-implemented feature with good architecture and testing. The core functionality is solid:

  • Clean separation of concerns between blob storage and HTTP handling
  • Proper backpressure and resource management
  • Comprehensive unit tests for critical components
  • Good UX considerations (lazy loading, large file warnings)

The issues identified are mostly minor and can be addressed in follow-up PRs:

  • Critical: None
  • High Priority: None
  • Medium Priority: Type inconsistencies, blob write error handling, full memory load for large bodies
  • Low Priority: Throttling edge case, ephemeral cleanup, disk quotas

Great work on this feature! The code quality is high and demonstrates good understanding of both Rust async patterns and React best practices.


Reviewed with: Claude Code PR Review

@gschier gschier merged commit 26a3e88 into main Dec 28, 2025
3 checks passed
@gschier gschier deleted the request-body branch December 28, 2025 16:07