Skip to content

Conversation

@ZocoLini
Copy link
Collaborator

@ZocoLini ZocoLini commented Dec 16, 2025

This change aims to refactor the background worker that persists the data when receives a command into a time based. SegmentCache keeps in memory the segments evicted to fast load if requested while waiting for someone to make the call to persist them (the new time based worker in this case). Once they are persisted they are dropped from memory.

Using criterion benches this change improved 10% performance when requesting items randomly. It also increases performance when storing items because the eviction is no longer persisting in the moment

Summary by CodeRabbit

  • Refactor

    • Reworked background persistence to run on a fixed timer instead of command-driven signaling.
    • Introduced in-memory eviction cache to retain evicted segments for controlled persistence.
    • Simplified segment state machine and persistence flow for more predictable saves.
    • Shutdown now performs direct, synchronous persistence and stops the background worker deterministically.
  • Tests

    • Shortened a test by removing an artificial delay to reflect the new direct persistence timing.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 16, 2025

Walkthrough

Refactors background persistence from a command-driven worker to a timer-driven loop that periodically calls persist_evicted; segment cache now retains evicted segments in-memory for later persistence and the segment state machine is simplified from Clean/Dirty/Saving to Clean/Dirty. Several call signatures and worker-related fields were removed.

Changes

Cohort / File(s) Summary
Background worker & manager
dash-spv/src/storage/manager.rs
Removed WorkerCommand and mpsc channel; replaced command-driven worker with a timer loop that calls persist_evicted every 5s; removed worker_tx and last_index_save_count; changed stop_worker to a synchronous abort-based stop.
Segment cache, eviction & persistence
dash-spv/src/storage/segments.rs
Added evicted: HashMap<u32, Segment<I>>; eviction moves dirty segments into evicted instead of immediate disk removal; added persist_evicted; Persistable trait gained path metadata constants and lost make_save_command; segment accessor visibility narrowed; removed Saving state (now Clean/Dirty).
Storage state & shutdown flow
dash-spv/src/storage/state.rs
shutdown signature changed from &mut self to &self; removed worker-command shutdown coordination; save_dirty now persists headers/filters/index directly (calls save_index_to_disk) instead of routing through worker.
Header storage callsite
dash-spv/src/storage/headers.rs
store_items_at_height calls updated to remove passing self/owner argument; explicit early drop(reverse_index) before continue was removed (lock now held until function end).
Tests (timing)
dash-spv/tests/reverse_index_test.rs
Removed a short tokio::time::sleep after storing headers; test now proceeds to shutdown immediately without the previous delay.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
participant API as Caller
participant Manager as DiskStorageManager
participant Cache as SegmentCache
participant Evicted as EvictedMap
participant Disk as Disk I/O
Note over Manager,Cache: Startup: timer-based worker created
API->>Manager: store_items_at_height(headers, height)
Manager->>Cache: store_items_at_height(headers, height)
Cache-->>Cache: may mark Segment Dirty or evict LRU
alt Evicted created
Cache->>Evicted: move dirty segment -> evicted map
end
Note over Manager,Manager: Background ticker (every 5s)
Manager->>Cache: persist_evicted()
Cache->>Evicted: iterate evicted segments
Evicted->>Disk: persist(segment)
Disk-->>Evicted: ack persisted
Evicted-->>Cache: remove/purge persisted entries

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Review areas to focus:
    • dash-spv/src/storage/segments.rs: eviction logic, transitions between active/evicted maps, and correctness of persistence ordering.
    • dash-spv/src/storage/manager.rs: timer loop correctness, worker abort semantics, and potential races during stop/shutdown.
    • dash-spv/src/storage/state.rs: shutdown/save_dirty path ensuring index and pending dirty data are flushed reliably.
    • dash-spv/src/storage/headers.rs: lock lifetime change (removed explicit drop(reverse_index)) — verify no deadlocks or performance regressions.
    • Tests: timing-sensitive tests updated; ensure no flakiness introduced by removing command-based coordination.

Poem

🐰
Commands hushed, the timer hums on cue,
Evicted bunnies hide until saved true.
Clean and Dirty, two states to keep,
Persistence hops while disks softly sleep.
A tiny rabbit cheers the new view.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.56% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Refactor: storage worker reworked' directly reflects the main change: refactoring the background worker from an on-demand to time-based persistence model, which is the core focus of this PR.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/storage-worker-reworked

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dash-spv/src/storage/segments.rs (1)

474-480: Silent data loss when inserting at out-of-bounds offset.

When offset > self.items.len(), the item is silently discarded with only a log message. This could lead to data corruption that's difficult to debug. Consider returning a Result or using debug_assert! to catch this during development.

🧹 Nitpick comments (4)
dash-spv/src/storage/segments.rs (2)

404-436: Async function uses synchronous file I/O.

Segment::load is marked async but uses synchronous std::fs::File::open and BufReader, which can block the async runtime. Consider using tokio::fs::File with async reads, or spawn the blocking operation on a dedicated thread pool via tokio::task::spawn_blocking.

Per coding guidelines: "Use async/await throughout the codebase, built on tokio runtime".


445-447: Minor: Synchronous create_dir_all in async context.

fs::create_dir_all is synchronous and could briefly block the async runtime. Consider using tokio::fs::create_dir_all for consistency with the async atomic_write that follows.

dash-spv/src/storage/manager.rs (1)

156-164: Worker handle not cleared after abort.

After abort(), worker_handle remains Some(...) with an aborted handle. While start_worker overwrites it, consider clearing it to None for cleaner state management and to avoid holding a reference to an aborted task.

-    pub(super) fn stop_worker(&self) {
+    pub(super) fn stop_worker(&mut self) {
         if let Some(handle) = &self.worker_handle {
             handle.abort();
         }
+        self.worker_handle = None;
     }

Note: This would require changing the signature to &mut self, which may conflict with shutdown(&self) in state.rs.

dash-spv/src/storage/state.rs (1)

436-447: Sequential write lock acquisition in save_dirty.

Each persist() call acquires a write lock on its respective cache. While this is safe, if persistence takes significant time, it could increase latency for concurrent readers. Consider whether the locks need to be held during the entire I/O operation or only for copying data.

This is a performance consideration for the future rather than a correctness issue.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9234fd and a558988.

📒 Files selected for processing (5)
  • dash-spv/src/storage/headers.rs (1 hunks)
  • dash-spv/src/storage/manager.rs (3 hunks)
  • dash-spv/src/storage/segments.rs (12 hunks)
  • dash-spv/src/storage/state.rs (7 hunks)
  • dash-spv/tests/reverse_index_test.rs (0 hunks)
💤 Files with no reviewable changes (1)
  • dash-spv/tests/reverse_index_test.rs
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.rs: Never hardcode network parameters, addresses, or keys in Rust code
Use proper error types with thiserror crate and propagate errors appropriately in Rust code
Use tokio runtime for async operations in Rust code
Use conditional compilation with feature flags for optional features
Write unit tests for new functionality in Rust
Format Rust code with cargo fmt before committing
Ensure clippy passes with all warnings as errors on all Rust code

**/*.rs: Each crate keeps sources in src/; unit tests live alongside code with #[cfg(test)]
Maintain MSRV of 1.89 for Rust code
Format Rust code with rustfmt (see rustfmt.toml); run cargo fmt --all before commits
Lint Rust code with clippy; avoid unwrap()/expect() in library code; use error types (e.g., thiserror)
Use snake_case for function and variable names in Rust
Use UpperCamelCase for types and traits in Rust
Use SCREAMING_SNAKE_CASE for constants in Rust
Follow mixed editions (2021/2024) and crate-specific idioms; prefer async via tokio where applicable
Unit tests should be placed near code with descriptive names (e.g., test_parse_address_mainnet)
Use #[ignore] for network-dependent or long-running tests; run with -- --ignored flag
Never commit secrets or real keys; avoid logging sensitive data in Rust code
Keep test vectors deterministic in Rust test code

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
dash-spv/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

dash-spv/**/*.rs: Use async/await throughout the codebase, built on tokio runtime
Use Arc for trait objects to enable runtime polymorphism for NetworkManager and StorageManager
Use Tokio channels for inter-component message passing between async tasks
Maintain minimum Rust version (MSRV) of 1.89 and use only compatible syntax and features

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
dash-spv/src/storage/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
🧠 Learnings (19)
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/storage/**/*.rs : Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-06-26T15:54:02.509Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:54:02.509Z
Learning: The `StorageManager` trait in `dash-spv/src/storage/mod.rs` uses `&mut self` methods but is also `Send + Sync`, and implementations often use interior mutability for concurrency. This can be confusing, so explicit documentation should clarify thread-safety expectations and the rationale for the API design.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/sync/**/*.rs : Implement sequential phase-based synchronization via SyncManager with phases progressing in order: Headers → Masternode List → Filter Headers → Filters → Blocks

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/headers.rs
  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use async/await throughout the codebase, built on tokio runtime

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:01:18.174Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-01T08:01:18.174Z
Learning: Organize workspace with crates: `dash`, `hashes`, `internals`, `dash-network`, `dash-spv`, `key-wallet`, `rpc-*`, utilities (`fuzz`, `test-utils`), and FFI crates (`*-ffi`)

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Arc<dyn TraitName> for trait objects to enable runtime polymorphism for NetworkManager and StorageManager

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/src/**/*.rs : Use thread-local storage for error propagation via `dash_spv_ffi_get_last_error()` function

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Tokio channels for inter-component message passing between async tasks

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T15:51:03.126Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:51:03.126Z
Learning: In Rust network code, using unwrap() on SystemTime::now().duration_since(UNIX_EPOCH) can panic if the system clock is before UNIX_EPOCH. Use unwrap_or with a safe fallback (e.g., Duration::from_secs(0)) to prevent panics in edge cases.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/network/**/*.rs : Implement configurable timeouts with recovery mechanisms for network operations

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/*.rs : Never panic in library code; use `Result<T>` for all fallible operations with custom `Error` type variants and provide context in error messages

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-02-25T06:13:52.858Z
Learnt from: QuantumExplorer
Repo: dashpay/rust-dashcore PR: 51
File: dash/src/sml/masternode_list/scores_for_quorum.rs:50-73
Timestamp: 2025-02-25T06:13:52.858Z
Learning: ScoreHash is a cryptographic hash in the rust-dashcore library, and therefore does not need collision handling when used as a key in collections like BTreeMap due to the extremely low probability of collisions.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/managed_account/**/*.rs : Implement atomic state updates in ManagedAccount: when processing transactions, update address usage state, UTXO set, transaction history, and balance calculations together to maintain consistency

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-06-26T16:01:37.609Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:01:37.609Z
Learning: The mempool tracking infrastructure (UnconfirmedTransaction, MempoolState, configuration, and mempool_filter.rs) is fully implemented and integrated in the Dash SPV client as of this PR, including client logic, FFI APIs, and tests.

Applied to files:

  • dash-spv/src/storage/state.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-01T08:00:50.618Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: swift-dash-core-sdk/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:50.618Z
Learning: When making changes to Rust FFI in dash-spv-ffi, rebuild with `cargo build --release` and run Swift tests to verify integration

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/tests/**/*.rs : Organize tests into unit tests (in-module), integration tests (tests/ directory), real network tests (with live Dash Core nodes), and performance benchmarks

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Keep test vectors updated with Dash Core, monitor for new DIPs affecting wallet structure, maintain backward compatibility for serialized wallets, and update derivation paths for Platform features

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T08:01:18.174Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-01T08:01:18.174Z
Learning: Applies to **/*.rs : Keep test vectors deterministic in Rust test code

Applied to files:

  • dash-spv/src/storage/state.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/tests/unit/**/*.rs : Add corresponding unit tests in `tests/unit/` for each new FFI function

Applied to files:

  • dash-spv/src/storage/state.rs
🧬 Code graph analysis (1)
dash-spv/src/storage/state.rs (1)
dash-spv/src/storage/headers.rs (2)
  • headers (21-21)
  • save_index_to_disk (66-74)
🔇 Additional comments (9)
dash-spv/src/storage/segments.rs (3)

214-225: Eviction logic correctly preserves dirty segments.

The LRU eviction moves dirty segments to the evicted map to ensure data isn't lost before the timer-based persistence runs. This aligns well with the PR objective of decoupling eviction from immediate persistence.


390-390: Segment size differs from documented guideline.

The segment size is set to 50,000 items, but coding guidelines specify 10,000-header segments. If this is an intentional change for performance optimization, consider updating the documentation to reflect the new segment size.

Based on coding guidelines: "Store headers in 10,000-header segments with index files in headers/ directory".


518-549: Good test coverage for eviction semantics.

The test correctly validates the LRU eviction behavior with MAX_SEGMENTS + 1 segments, ensuring segments are properly evicted and reloaded from the eviction map or disk.

dash-spv/src/storage/headers.rs (1)

16-34: Clean API simplification.

The method now directly calls store_items_at_height(headers, height) without passing self, aligning with the new direct persistence model. The reverse index update logic remains correct.

dash-spv/src/storage/manager.rs (1)

144-154: Timer-driven persistence loop looks correct.

The 5-second interval for persist_evicted provides a good balance between durability and performance, aligning with the PR objective of moving eviction persistence out of the immediate store path.

dash-spv/src/storage/state.rs (4)

429-447: Shutdown correctly persists all dirty data.

The shutdown method properly stops the worker first, then calls save_dirty to persist all segment caches and the header index. This ensures data durability on graceful shutdown.


349-392: clear() correctly stops worker before filesystem operations.

Stopping the worker before removing files and clearing in-memory state prevents race conditions. The retry logic for remove_dir_all handles transient filesystem races well.


691-694: Trait implementation correctly adapts signature.

The trait requires &mut self for shutdown, and the implementation correctly delegates to Self::shutdown(&self). This is valid Rust and maintains trait compatibility.


852-874: Excellent test for shutdown durability.

This test verifies that headers stored after save_dirty() are still persisted via shutdown(), confirming the new persistence model works correctly across restarts.

@github-actions github-actions bot added the merge-conflict The PR conflicts with the target branch. label Dec 17, 2025
@github-actions
Copy link

This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.

@github-actions github-actions bot removed the merge-conflict The PR conflicts with the target branch. label Dec 18, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
dash-spv/src/storage/manager.rs (1)

137-141: abort() may leave dirty data unpersisted.

This concern was raised in a previous review. Using abort() terminates the worker immediately without a final persist_evicted call. While shutdown() in state.rs calls save_dirty() after stop_worker(), code paths like clear() and clear_filters() that only call stop_worker() could lose data if dirty segments exist.

🧹 Nitpick comments (4)
dash-spv/src/storage/manager.rs (1)

121-130: Worker loop lacks error resilience and could block concurrent access.

The background worker acquires write locks on all three caches sequentially every 5 seconds. Consider two improvements:

  1. If any persist_evicted call panics, the worker task will terminate silently without restarting.
  2. Holding write locks sequentially may cause brief contention during persistence cycles.
🔎 Consider wrapping each persist call for resilience:
 let worker_handle = tokio::spawn(async move {
     let mut ticker = tokio::time::interval(Duration::from_secs(5));

     loop {
         ticker.tick().await;

-        block_headers.write().await.persist_evicted().await;
-        filter_headers.write().await.persist_evicted().await;
-        filters.write().await.persist_evicted().await;
+        if let Err(e) = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| async {
+            block_headers.write().await.persist_evicted().await;
+        })) {
+            tracing::error!("Panic in block_headers persistence: {:?}", e);
+        }
+        // Similar for filter_headers and filters...
     }
 });

Alternatively, the current approach may be acceptable if persistence failures are rare and a restart is the expected recovery path.

dash-spv/src/storage/segments.rs (3)

74-79: Unbounded evicted map could cause memory growth.

The evicted map has no size limit. If segments are evicted faster than the 5-second persistence interval can flush them (e.g., rapid sequential access across many segments), the evicted map could grow unbounded. Consider adding a maximum evicted size or blocking eviction when the evicted queue is too large.


223-233: Use immutable iterator since nothing is mutated.

iter_mut() is used but only last_accessed is read for comparison. Use iter() instead.

🔎 Apply this diff:
-            let key_to_evict =
-                self.segments.iter_mut().min_by_key(|(_, s)| s.last_accessed).map(|(k, v)| (*k, v));
-
-            if let Some((key, _)) = key_to_evict {
+            let key_to_evict =
+                self.segments.iter().min_by_key(|(_, s)| s.last_accessed).map(|(k, _)| *k);
+
+            if let Some(key) = key_to_evict {
                 if let Some(segment) = self.segments.remove(&key) {
                     if segment.state == SegmentState::Dirty {
                         self.evicted.insert(key, segment);
                     }
                 }
             }

480-502: Blocking I/O in async context.

Segment::persist uses synchronous fs::create_dir_all on line 487 within an async function. While atomic_write is async, the directory creation could block the async runtime briefly.

🔎 Consider using tokio::fs for consistency:
-        if let Err(e) = fs::create_dir_all(path.parent().unwrap()) {
+        if let Err(e) = tokio::fs::create_dir_all(path.parent().unwrap()).await {
             return Err(StorageError::WriteFailed(format!("Failed to persist segment: {}", e)));
         }

This is a minor optimization since directory creation is typically fast, but it maintains consistency with the async pattern used elsewhere.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a558988 and 1f168b7.

📒 Files selected for processing (3)
  • dash-spv/src/storage/manager.rs (3 hunks)
  • dash-spv/src/storage/segments.rs (12 hunks)
  • dash-spv/src/storage/state.rs (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • dash-spv/src/storage/state.rs
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.rs: Never hardcode network parameters, addresses, or keys in Rust code
Use proper error types with thiserror crate and propagate errors appropriately in Rust code
Use tokio runtime for async operations in Rust code
Use conditional compilation with feature flags for optional features
Write unit tests for new functionality in Rust
Format Rust code with cargo fmt before committing
Ensure clippy passes with all warnings as errors on all Rust code

**/*.rs: Each crate keeps sources in src/; unit tests live alongside code with #[cfg(test)]
Maintain MSRV of 1.89 for Rust code
Format Rust code with rustfmt (see rustfmt.toml); run cargo fmt --all before commits
Lint Rust code with clippy; avoid unwrap()/expect() in library code; use error types (e.g., thiserror)
Use snake_case for function and variable names in Rust
Use UpperCamelCase for types and traits in Rust
Use SCREAMING_SNAKE_CASE for constants in Rust
Follow mixed editions (2021/2024) and crate-specific idioms; prefer async via tokio where applicable
Unit tests should be placed near code with descriptive names (e.g., test_parse_address_mainnet)
Use #[ignore] for network-dependent or long-running tests; run with -- --ignored flag
Never commit secrets or real keys; avoid logging sensitive data in Rust code
Keep test vectors deterministic in Rust test code

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
dash-spv/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

dash-spv/**/*.rs: Use async/await throughout the codebase, built on tokio runtime
Use Arc for trait objects to enable runtime polymorphism for NetworkManager and StorageManager
Use Tokio channels for inter-component message passing between async tasks
Maintain minimum Rust version (MSRV) of 1.89 and use only compatible syntax and features

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
dash-spv/src/storage/**/*.rs

📄 CodeRabbit inference engine (dash-spv/CLAUDE.md)

Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
🧠 Learnings (12)
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/storage/**/*.rs : Store headers in 10,000-header segments with index files in headers/ directory, filter headers and compact block filters in filters/ directory, and state in state/ directory

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/sync/**/*.rs : Implement sequential phase-based synchronization via SyncManager with phases progressing in order: Headers → Masternode List → Filter Headers → Filters → Blocks

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-06-26T15:54:02.509Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T15:54:02.509Z
Learning: The `StorageManager` trait in `dash-spv/src/storage/mod.rs` uses `&mut self` methods but is also `Send + Sync`, and implementations often use interior mutability for concurrency. This can be confusing, so explicit documentation should clarify thread-safety expectations and the rationale for the API design.

Applied to files:

  • dash-spv/src/storage/manager.rs
  • dash-spv/src/storage/segments.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use async/await throughout the codebase, built on tokio runtime

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T16:02:42.390Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:02:42.390Z
Learning: Passing exclusive mutable references to network and storage managers into the sync manager in async Rust can lead to borrow checker issues or runtime contention if concurrent access is needed elsewhere. It's advisable to document this architectural tradeoff and consider refactoring to interior mutability or message passing for shared access as the codebase evolves.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-02-25T06:13:52.858Z
Learnt from: QuantumExplorer
Repo: dashpay/rust-dashcore PR: 51
File: dash/src/sml/masternode_list/scores_for_quorum.rs:50-73
Timestamp: 2025-02-25T06:13:52.858Z
Learning: ScoreHash is a cryptographic hash in the rust-dashcore library, and therefore does not need collision handling when used as a key in collections like BTreeMap due to the extremely low probability of collisions.

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Tokio channels for inter-component message passing between async tasks

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T08:00:37.919Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: key-wallet/CLAUDE.md:0-0
Timestamp: 2025-12-01T08:00:37.919Z
Learning: Applies to key-wallet/**/{wallet,account,managed_account}/**/*.rs : Use BTreeMap for ordered data structures (accounts, transactions) and HashMap for quick lookups (address to index mapping) to optimize data structure performance

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-01T07:59:58.608Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv-ffi/CLAUDE.md:0-0
Timestamp: 2025-12-01T07:59:58.608Z
Learning: Applies to dash-spv-ffi/src/**/*.rs : Use thread-local storage for error propagation via `dash_spv_ffi_get_last_error()` function

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/**/*.rs : Use Arc<dyn TraitName> for trait objects to enable runtime polymorphism for NetworkManager and StorageManager

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-12-16T09:03:55.811Z
Learnt from: CR
Repo: dashpay/rust-dashcore PR: 0
File: dash-spv/CLAUDE.md:0-0
Timestamp: 2025-12-16T09:03:55.811Z
Learning: Applies to dash-spv/src/network/**/*.rs : Implement configurable timeouts with recovery mechanisms for network operations

Applied to files:

  • dash-spv/src/storage/manager.rs
📚 Learning: 2025-06-26T16:01:37.609Z
Learnt from: DCG-Claude
Repo: dashpay/rust-dashcore PR: 0
File: :0-0
Timestamp: 2025-06-26T16:01:37.609Z
Learning: The mempool tracking infrastructure (UnconfirmedTransaction, MempoolState, configuration, and mempool_filter.rs) is fully implemented and integrated in the Dash SPV client as of this PR, including client logic, FFI APIs, and tests.

Applied to files:

  • dash-spv/src/storage/segments.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: fuzz (dash_outpoint_string)
  • GitHub Check: fuzz (dash_deserialize_address)
  • GitHub Check: fuzz (dash_script_bytes_to_asm_fmt)
  • GitHub Check: fuzz (dash_deserialize_script)
  • GitHub Check: fuzz (dash_deserialize_witness)
  • GitHub Check: fuzz (hashes_sha1)
  • GitHub Check: fuzz (hashes_ripemd160)
  • GitHub Check: fuzz (hashes_sha512_256)
  • GitHub Check: fuzz (hashes_json)
  • GitHub Check: fuzz (hashes_sha256)
  • GitHub Check: fuzz (dash_deserialize_amount)
  • GitHub Check: fuzz (hashes_sha512)
  • GitHub Check: fuzz (hashes_cbor)
  • GitHub Check: fuzz (dash_deser_net_msg)
  • GitHub Check: fuzz (dash_deserialize_block)
  • GitHub Check: RPC Tests (stable, true)
  • GitHub Check: SPV Components Tests
  • GitHub Check: Core Components Tests
  • GitHub Check: Pre-commit (macos-latest)
  • GitHub Check: Pre-commit (ubuntu-latest)
🔇 Additional comments (5)
dash-spv/src/storage/segments.rs (5)

405-406: Verify segment size differs from coding guidelines.

ITEMS_PER_SEGMENT is set to 50,000, but the coding guidelines specify "10,000-header segments". Please confirm this deviation is intentional.

Based on learnings, the guidelines state: "Store headers in 10,000-header segments with index files in headers/ directory".


355-362: LGTM - persist_evicted correctly clears after iteration.

The method properly iterates over evicted segments, persists each one, logs errors without panicking, and clears the evicted map afterward. This addresses the previous review comment about duplicate persistence logic.


389-394: Clean state model simplification.

The simplified SegmentState enum (Clean/Dirty only) removes the previous Saving intermediate state. This is appropriate for the new timer-based persistence model where state transitions are synchronous within the persist call.


313-352: LGTM - store_items delegation and height tracking.

The store_items method correctly delegates to store_items_at_height using next_height(), and store_items_at_height properly updates tip_height after storing all items.


569-599: Test correctly validates eviction and reload behavior.

The test_segment_cache_eviction test properly validates the LRU eviction policy by exceeding MAX_ACTIVE_SEGMENTS and verifying segments can be reloaded with correct data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants