Skip to content

melatron/crdt-rga

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RGA CRDT - Replicated Growable Array

A Rust implementation of the Replicated Growable Array (RGA) Conflict-free Replicated Data Type (CRDT), suitable for collaborative text editing and distributed systems where concurrent modifications need to be merged consistently.

Features

  • Conflict-free: Concurrent operations can be applied in any order and will converge to the same state
  • Causally consistent: Operations maintain causal relationships through Lamport timestamps
  • High-performance concurrent operations: Uses lock-free SkipMap for O(log n) operations without global locks
  • Advanced conflict resolution: Enhanced Lamport timestamps with sequence numbers for deterministic ordering
  • Tombstone-based deletion: Supports safe deletion with eventual consistency
  • Thread-safe: Lock-free data structures with atomic operations for maximum concurrency
  • Modular design: Well-structured codebase with separate modules for different concerns
  • Benchmarked performance: Achieves 300,000+ operations/second in concurrent scenarios

Architecture

The implementation is split into several modules:

Core Modules

  • types.rs: Enhanced types including ReplicaId, LamportTimestamp with sequence numbers, UniqueId, and atomic LamportClock
  • node.rs: Node structure representing individual characters, with sentinel constants
  • rga.rs: High-performance RGA implementation using concurrent SkipMap data structure
  • lib.rs: Public API and module declarations

Key Concepts

  1. Enhanced Lamport Timestamps: Each operation is tagged with a logical timestamp consisting of a counter, replica ID, and sequence number, ensuring strong total ordering across all replicas even under extreme concurrency.

  2. UniqueId: Derived from enhanced Lamport timestamps, these provide both unique identification and deterministic conflict resolution.

  3. Concurrent Data Structures: Uses crossbeam-skiplist::SkipMap for lock-free concurrent operations and parking_lot::RwLock for fine-grained node access.

  4. Atomic Clock Management: Thread-safe LamportClock using atomic operations for high-performance timestamp generation.

  5. Sentinel Nodes: Special start and end markers that provide stable reference points for all replicas.

  6. Tombstone Deletion: Instead of physically removing nodes, deletions are marked with a flag to maintain consistency.

Usage

Basic Example

use crdt_rga::RGA;

// Create a new RGA instance for replica 1
let rga = RGA::new(1);

// Insert characters
let start_id = rga.sentinel_start_id();
let h_id = rga.insert_after(start_id, 'H').unwrap();
let e_id = rga.insert_after(h_id, 'e').unwrap();

println!("Content: {}", rga.to_string()); // "He"

// Delete a character
rga.delete(e_id).unwrap();
println!("Content: {}", rga.to_string()); // "H"

Collaborative Editing Example

use crdt_rga::RGA;

// Create two replicas
let rga1 = RGA::new(1);
let rga2 = RGA::new(2);

let start_id = rga1.sentinel_start_id();

// Replica 1 inserts "Hello"
let mut last_id = start_id;
for ch in "Hello".chars() {
    last_id = rga1.insert_after(last_id, ch).unwrap();
}

// Replica 2 concurrently inserts "World"
let mut last_id2 = start_id;
for ch in "World".chars() {
    last_id2 = rga2.insert_after(last_id2, ch).unwrap();
}

// Simulate network replication
for node in rga1.all_nodes() {
    if !node.is_sentinel() {
        rga2.apply_remote_op(node);
    }
}

for node in rga2.all_nodes() {
    if !node.is_sentinel() && rga1.find_node_by_char(node.character).is_none() {
        rga1.apply_remote_op(node);
    }
}

// Both replicas converge to the same state
assert_eq!(rga1.to_string(), rga2.to_string());
println!("Converged content: {}", rga1.to_string());

API Reference

RGA

The main RGA struct provides the following methods:

Construction

  • new(replica_id: ReplicaId) -> Self: Creates a new RGA instance

Operations

  • insert_after(after_id: UniqueId, character: char) -> Result<UniqueId, &'static str>: Inserts a character after the specified node
  • delete(id_to_delete: UniqueId) -> Result<(), &'static str>: Logically deletes a node
  • apply_remote_op(remote_node: Node): Applies a remote operation

Queries

  • to_string() -> String: Returns visible content as a string
  • all_nodes() -> Vec<Node>: Returns all nodes including deleted and sentinel
  • visible_nodes() -> Vec<Node>: Returns only visible nodes
  • total_node_count() -> usize: Total number of nodes
  • visible_node_count() -> usize: Number of visible nodes

Utilities

  • dump_nodes(): Prints all nodes for debugging
  • find_node_by_char(character: char) -> Option<UniqueId>: Finds a node by character
  • sentinel_start_id() -> UniqueId: Gets the start sentinel ID
  • sentinel_end_id() -> UniqueId: Gets the end sentinel ID

Types

  • ReplicaId: Type alias for u64, identifies each replica
  • LamportTimestamp: Logical timestamp with counter and replica ID
  • UniqueId: Unique identifier derived from Lamport timestamp

Node

Represents individual characters in the RGA:

pub struct Node {
    pub id: UniqueId,
    pub character: char,
    pub is_deleted: bool,
}

Running the Examples

The project includes comprehensive examples demonstrating various aspects of the RGA:

# Run the main example
cargo run

# Run all tests
cargo test

# Run tests with output
cargo test -- --nocapture

Testing

The codebase includes extensive unit tests covering:

  • Type conversions and ordering
  • Node operations and visibility
  • Basic RGA operations
  • Concurrent operations and convergence
  • Error handling

Run tests with:

cargo test

Benchmarking

The implementation includes comprehensive benchmarks measuring:

  • Sequential vs concurrent operation performance
  • Multi-replica synchronization throughput
  • Conflict resolution under extreme load
  • Memory usage patterns with tombstones

Run benchmarks with:

cargo bench

Example performance results:

  • 325,000+ ops/sec for concurrent insertions
  • 1.9x speedup over sequential operations
  • Perfect conflict resolution across 8+ replicas
  • Sub-millisecond convergence for typical workloads

Implementation Details

Ordering

The RGA uses Lamport timestamps to establish a total order across all operations. When two operations have the same counter value, the replica ID is used as a tiebreaker, ensuring deterministic ordering.

Concurrency

The implementation uses lock-free concurrent data structures:

  • crossbeam-skiplist::SkipMap: Lock-free ordered map for storing nodes
  • parking_lot::RwLock: Fine-grained read-write locks for individual nodes
  • Atomic operations: For Lamport clock management and counters
  • Thread-safe design: Multiple threads can safely operate concurrently without global locks

This design achieves significant performance improvements over traditional mutex-based approaches, with measured speedups of 1.5-2x in concurrent scenarios.

Memory Management

Deleted nodes are retained as tombstones to maintain consistency. In a production implementation, you might want to add garbage collection for tombstones that are no longer needed for conflict resolution.

Performance Characteristics

  • Insert: O(log n) with lock-free SkipMap, highly concurrent
  • Delete: O(log n) for lookup + O(1) for atomic flag update
  • Query: O(n) for string conversion, O(log n) for individual lookups
  • Memory: O(n) where n includes tombstones
  • Concurrency: Lock-free operations scale with CPU cores
  • Throughput: 300,000+ operations/second measured in benchmarks

Future Improvements

  • Garbage collection for old tombstones
  • Serialization/deserialization for network transmission
  • Position-based insertion API
  • Batch operations for even better performance
  • Custom conflict resolution strategies
  • WASM compilation for web environments
  • Network protocol implementations
  • Persistent storage backends

License

This project is provided as an educational implementation of the RGA CRDT algorithm.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published