High-performance Ethereum blockchain indexer that fetches and stores block headers and transaction data from genesis to the latest finalized block. This indexer is a critical component of the Fossil Light Client infrastructure, providing the block header database that feeds the MMR Builder and Light Client synchronization process.
Fossil Headers DB is a production-ready, PostgreSQL-backed service that efficiently indexes Ethereum block headers for the Fossil Light Client ecosystem. It runs real-time synchronization and historical backfilling side by side to ensure complete blockchain coverage.
- Dual Indexing Strategy: Simultaneous real-time indexing (quick service) and historical backfilling (batch service)
- Smart Gap Detection: Automatically identifies and fills missing blocks in the database
- Fault Tolerant: Robust retry mechanisms and RPC failover support
- Production Ready: Optimized for AWS ECS deployment with health checks and monitoring
- Light Client Integration: Provides validated block headers for MMR construction and proof generation
- Database Migrations: Automated schema management with SQLx migrations
- HTTP API: Health check endpoints for load balancer integration
The indexer consists of two main services that run concurrently:
Purpose: Real-time synchronization with the Ethereum network
- Polls the network for the latest finalized block every 10 seconds (configurable); see the poll-loop sketch after this list
- Indexes new blocks as they become finalized
- Ensures the database stays current with the blockchain tip
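For illustration, the heart of such a service is a small poll loop. The sketch below is hypothetical (the real implementation lives in src/indexer/quick_service.rs, and the helper names here are made up), but it shows the shape: wake on an interval, ask for the finalized head, and index forward.

```rust
use std::time::Duration;

// Hypothetical stand-ins for the RPC client and repository layer.
async fn fetch_latest_finalized_block() -> u64 { 20_000_000 }
async fn index_block(number: u64) { println!("indexed block {number}"); }

// Sketch of a quick-indexer poll loop: tick, fetch the finalized
// head, then index every block past the stored cursor, in order.
async fn run_quick_indexer(poll_interval_secs: u64, mut last_indexed: u64) {
    let mut ticker = tokio::time::interval(Duration::from_secs(poll_interval_secs));
    loop {
        ticker.tick().await;
        let finalized = fetch_latest_finalized_block().await;
        while last_indexed < finalized {
            last_indexed += 1;
            index_block(last_indexed).await;
        }
    }
}
```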
Purpose: Historical backfilling and gap filling
- Processes blocks in large batches (1000 blocks by default)
- Automatically detects gaps in the stored data (see the illustrative query after this list)
- Backfills from a configurable starting block to genesis (block 0)
- Optimized for high-throughput historical data ingestion
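For intuition, gap detection over the block_header table can be expressed as a single window-function query. This is an illustrative query, not necessarily the one the batch service executes:

```sql
-- A row where the next stored block number jumps by more than 1
-- marks a missing range of blocks.
SELECT number + 1      AS gap_start,
       next_number - 1 AS gap_end
FROM (
    SELECT number,
           LEAD(number) OVER (ORDER BY number) AS next_number
    FROM block_header
) AS ordered
WHERE next_number > number + 1;
```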
```
┌─────────────────────────────────────────────────────────────┐
│                       Ethereum Network                       │
│                  (Finalized Block Headers)                   │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                  Fossil Headers DB Indexer                   │
│   ┌──────────────────┐        ┌──────────────────┐          │
│   │  Quick Indexer   │        │  Batch Indexer   │          │
│   │ (Real-time sync) │        │ (Backfill/Gaps)  │          │
│   └──────────────────┘        └──────────────────┘          │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                     PostgreSQL Database                      │
│             (Block Headers + Transaction Data)               │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│                     Fossil Light Client                      │
│   ┌──────────────────┐        ┌──────────────────┐          │
│   │   MMR Builder    │        │   Light Client   │          │
│   │  (1024 blocks/   │        │   (Continuous    │          │
│   │   batch proofs)  │        │      sync)       │          │
│   └──────────────────┘        └──────────────────┘          │
└──────────────────────────┬──────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────┐
│              Starknet (Fossil Store Contract)                │
│          (MMR Roots, Proofs, Fee Data, IPFS CIDs)            │
└─────────────────────────────────────────────────────────────┘
```
- Installation Guide - Set up your development environment
- Quick Start - Get the indexer running in 5 minutes
- Configuration - Environment variables and settings
- Testing - Run tests and verify functionality
- System Overview - High-level architecture and design decisions
- Indexing Services - Quick and Batch indexer deep dive
- Database Schema - PostgreSQL tables and relationships
- RPC Client - Ethereum RPC interaction and retry logic
- Local Development - Running locally with Docker Compose
- Deployment - Production deployment on AWS ECS
- Monitoring - Health checks, logging, and metrics
- Troubleshooting - Common issues and solutions
- Database Management - Migrations, backups, and maintenance
- Light Client Integration - How the Light Client consumes indexed data
- API Reference - HTTP endpoints and responses
- Database Access - Direct database query patterns
- Rust 1.70+ (installed via rustup)
- PostgreSQL 16+ (for production) or Docker (for local development)
- Ethereum RPC endpoint (Alchemy, Infura, or self-hosted node)
```bash
# Clone the repository
git clone https://github.com/OilerNetwork/fossil-headers-db.git
cd fossil-headers-db

# Install dependencies and start local environment
make dev-setup
```

This command will:
- Install Rust dependencies
- Install SQLx CLI for database migrations
- Start PostgreSQL in Docker
- Run database migrations
- Set up the development environment
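If you prefer to run the pieces yourself, the manual equivalents look roughly like this (a sketch assuming the default local database URL; `make dev-setup` wraps these steps):

```bash
# Install the SQLx CLI used for migrations
cargo install sqlx-cli --no-default-features --features postgres

# Apply the migrations against the local database
sqlx migrate run --database-url postgresql://postgres:postgres@localhost:5432/postgres
```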
Create a .env file in the project root:
```bash
# Database connection
DB_CONNECTION_STRING=postgresql://postgres:postgres@localhost:5432/postgres

# Ethereum RPC endpoint (replace with your provider)
NODE_CONNECTION_STRING=https://eth-mainnet.g.alchemy.com/v2/YOUR_API_KEY

# HTTP server configuration
ROUTER_ENDPOINT=0.0.0.0:3000
RUST_LOG=info

# Indexer settings
INDEX_TRANSACTIONS=false  # Set to true to index transaction data
START_BLOCK_OFFSET=1024   # Start backfilling from latest - 1024 blocks

# Optional: Development mode (loads .env file)
IS_DEV=true
```

```bash
# Run the modern indexer service (recommended for production)
make run-indexer

# Or run directly with cargo
cargo run --bin fossil_indexer
```

The indexer will:
- Connect to PostgreSQL and run migrations
- Start the Quick Indexer (real-time sync)
- Start the Batch Indexer (historical backfill)
- Start the HTTP health check server on port 3000
```bash
# Check health endpoint
curl http://localhost:3000/health

# Expected response:
# {"status":"healthy","timestamp":"2025-10-14T12:00:00Z"}

# Check database for indexed blocks
psql postgresql://postgres:postgres@localhost:5432/postgres \
  -c "SELECT COUNT(*) as blocks, MIN(number) as min_block, MAX(number) as max_block FROM block_header;"
```

Watch the logs to see indexing progress:
```
# Quick indexer logs show real-time block processing
[quick_index] Starting quick indexer service
[quick_index] Indexed block 20000000

# Batch indexer logs show backfill progress
[batch_index] Starting batch indexer service
[batch_index] Backfilling from block 19999000 to 0
[batch_index] Indexed batch: blocks 19999000-19998000 (1000 blocks in 12.3s)
```

| Command | Description |
|---|---|
| `make help` | Show all available commands |
| `make dev-setup` | Complete dev setup (deps + environment) |
| `make dev-up` | Start PostgreSQL database |
| `make dev-down` | Stop development environment |
| `make dev-clean` | Clean environment (remove volumes) |
| `make run-indexer` | Run production indexer service |
| `make run-cli ARGS='...'` | Run legacy CLI tool |
| `make test` | Run all tests |
| `make lint` | Run clippy linter |
| `make format` | Format code with rustfmt |
| `make build` | Build release binary |
| `make docker-build` | Build Docker images |
```bash
# Start development environment
make dev-up

# Run indexer (in another terminal)
make run-indexer

# Run tests
make test

# Stop environment when done
make dev-down
```

```bash
# Build and run with Docker Compose
docker compose -f docker/docker-compose.local.yml up -d

# View logs
docker compose -f docker/docker-compose.local.yml logs -f indexer

# Stop services
docker compose -f docker/docker-compose.local.yml down
```

| Variable | Required | Default | Description |
|---|---|---|---|
| `DB_CONNECTION_STRING` | Yes | - | PostgreSQL connection URL |
| `NODE_CONNECTION_STRING` | Yes | - | Ethereum RPC endpoint URL |
| `ROUTER_ENDPOINT` | No | `0.0.0.0:3000` | HTTP health check server address |
| `RUST_LOG` | No | `info` | Logging level (error, warn, info, debug, trace) |
| `INDEX_TRANSACTIONS` | No | `false` | Whether to index transaction data (headers only by default) |
| `START_BLOCK_OFFSET` | No | `1024` | Blocks before latest to start backfill from |
| `IS_DEV` | No | `false` | Enable development mode (loads .env file) |
The indexer behavior is controlled through the `IndexingConfig` structure. While defaults are production-ready, you can customize behavior programmatically:

```rust
use fossil_headers_db::indexer::lib::{IndexingConfig, start_indexing_services};

let config = IndexingConfig::builder()
    .db_conn_string("postgresql://...")
    .node_conn_string("https://...")
    .should_index_txs(false)
    .max_retries(10)          // Max retries per block
    .poll_interval(10)        // Seconds between polls (quick indexer)
    .rpc_timeout(300)         // RPC timeout in seconds
    .rpc_max_retries(5)       // Max RPC retries
    .index_batch_size(1000)   // Blocks per batch
    .start_block_offset(1024) // Start backfill from latest - offset
    .build()?;
```

`GET /health`

Returns the health status of the indexer service.
Response:
```json
{
  "status": "healthy",
  "timestamp": "2025-10-14T12:00:00Z"
}
```

Use Case: Load balancer health checks, monitoring systems
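Since the service is built on Axum (see Technology Stack below), an equivalent handler is only a few lines. This is a minimal sketch, not the crate's actual router code, and it assumes the chrono crate for the timestamp:

```rust
use axum::{routing::get, Json, Router};
use serde_json::{json, Value};

// Minimal health handler returning the JSON shape shown above.
async fn health() -> Json<Value> {
    Json(json!({
        "status": "healthy",
        "timestamp": chrono::Utc::now().to_rfc3339(), // assumes the chrono crate
    }))
}

// Router exposing /health; the real one lives in src/router/mod.rs.
pub fn router() -> Router {
    Router::new().route("/health", get(health))
}
```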
`GET /mmr`
`GET /mmr/<block_number>`

These endpoints will be implemented to provide MMR state information for the Light Client.
The indexer is designed for production deployment on AWS ECS using Fargate:
- Build and push Docker images:
```bash
# Build images
make docker-build

# Tag and push to ECR
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin YOUR_ACCOUNT.dkr.ecr.us-west-2.amazonaws.com
docker tag fossil-indexer:latest YOUR_ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/fossil-indexer:latest
docker tag fossil-migrate:latest YOUR_ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/fossil-migrate:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/fossil-indexer:latest
docker push YOUR_ACCOUNT.dkr.ecr.us-west-2.amazonaws.com/fossil-migrate:latest
```

- Run database migrations:
The migration service should run as a one-time ECS task before deploying the indexer:
```bash
aws ecs run-task \
  --cluster fossil-cluster \
  --task-definition fossil-migrate \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[subnet-xxx],securityGroups=[sg-xxx]}"
```

- Deploy indexer service:
Deploy the indexer as a long-running ECS service with load balancer integration:
```bash
aws ecs create-service \
  --cluster fossil-cluster \
  --service-name fossil-indexer \
  --task-definition fossil-indexer \
  --desired-count 1 \
  --launch-type FARGATE \
  --load-balancers targetGroupArn=arn:aws:elasticloadbalancing:...
```

See Deployment Guide for complete ECS configuration details.
- Database: PostgreSQL 16+ with 20GB+ storage (grows with indexed data)
- Compute: Minimum 1 vCPU, 2GB RAM (2 vCPU, 4GB RAM recommended)
- Network: Outbound HTTPS access to Ethereum RPC endpoint
- Storage: Database storage grows by approximately 10GB per million blocks; at that rate, a full mainnet backfill (20M+ blocks) is on the order of 200GB+
Monitor these metrics in production:
- Block Processing Rate: Blocks indexed per minute
- Database Lag: Latest indexed block vs. current finalized block
- RPC Success Rate: Percentage of successful RPC calls
- Gap Count: Number of missing blocks in database
- Health Check Response Time: HTTP endpoint latency
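Most of these can be read straight out of the database. Two illustrative queries, using the table and column names shown elsewhere in this README:

```sql
-- Gap count: block numbers missing between the lowest and highest
-- indexed block (0 means the stored range is contiguous).
SELECT (MAX(number) - MIN(number) + 1) - COUNT(*) AS gap_count
FROM block_header;

-- Latest indexed position and freshness; compare
-- current_latest_block_number against the finalized head reported by
-- your RPC provider to compute database lag.
SELECT current_latest_block_number, updated_at
FROM index_metadata;
```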
The indexer uses structured logging with the tracing crate:
```bash
# Set log level via RUST_LOG environment variable
export RUST_LOG=info   # Production (default)
export RUST_LOG=debug  # Development
export RUST_LOG=trace  # Debugging

# Module-specific logging
export RUST_LOG=fossil_headers_db::indexer=debug,fossil_headers_db::rpc=trace
```

```bash
# Health check endpoint
curl http://localhost:3000/health

# Database connection check
psql $DB_CONNECTION_STRING -c "SELECT 1;"

# Check indexer progress
psql $DB_CONNECTION_STRING -c "
SELECT
    current_latest_block_number as latest_indexed,
    backfilling_block_number as backfill_position,
    is_backfilling,
    indexing_starting_block_number as target_block,
    updated_at
FROM index_metadata;
"
```

1. Database Connection Errors
Error: `Failed to connect to database`

Solution: Verify `DB_CONNECTION_STRING` is correct and PostgreSQL is running:

```bash
psql $DB_CONNECTION_STRING -c "SELECT 1;"
```

2. RPC Rate Limiting
Error: `Too many requests (HTTP 429)`

Solution: The indexer has built-in retry logic (sketched below), but consider:
- Using a premium RPC provider with higher rate limits
- Adjusting `rpc_timeout` and `rpc_max_retries` in configuration
- Reducing `index_batch_size` to make fewer concurrent requests
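For reference, the built-in retry behaves roughly like the following. This is an illustrative sketch, not the crate's actual RPC client; the real logic is shaped by the `rpc_max_retries` and `rpc_timeout` settings:

```rust
use std::time::Duration;

// Generic retry-with-exponential-backoff loop: retry the call until
// it succeeds or the retry budget is exhausted.
async fn with_retries<T, E, F, Fut>(max_retries: u32, mut call: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut attempt = 0;
    loop {
        match call().await {
            Ok(value) => return Ok(value),
            Err(e) if attempt >= max_retries => return Err(e),
            Err(_) => {
                // Backoff: 1s, 2s, 4s, ... capped at 60s between attempts.
                let delay = Duration::from_secs((1u64 << attempt.min(6)).min(60));
                tokio::time::sleep(delay).await;
                attempt += 1;
            }
        }
    }
}
```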
3. Slow Backfilling
Batch indexer taking too long
Solution:
- Increase `index_batch_size` (default 1000, try 2000-5000)
- Use a faster RPC endpoint (dedicated node vs. shared service)
- Check database disk I/O performance
4. Missing Blocks (Gaps)
Gap detected in block sequence
Solution: The batch indexer automatically detects and fills gaps. Monitor logs:
```
[batch_index] Detected gap from block X to Y, filling...
```

See Troubleshooting Guide for more details.
Core Technologies:
- Rust (stable) - High-performance systems language
- Tokio - Async runtime for concurrent operations
- Axum - Modern HTTP framework for health endpoints
- SQLx - Compile-time checked SQL queries
- PostgreSQL - Robust relational database
- Reqwest - HTTP client for RPC calls
- Serde - Serialization/deserialization
- Tracing - Structured logging and diagnostics
Infrastructure:
- Docker - Containerization
- Docker Compose - Local development orchestration
- AWS ECS - Production container orchestration
- AWS RDS - Managed PostgreSQL (production)
Before submitting changes:
```bash
# Format code
make format

# Run linter
make lint

# Run all tests
make test

# Or run all checks together
make dev-test
```

- Type Safety: Use the validated types (`BlockNumber`, `BlockHash`, etc.) from `types.rs`
- Error Handling: Always use `Result<T>` with descriptive `BlockchainError` variants
- Testing: Write tests for new functionality (unit tests in module, integration tests in `tests/`)
- Documentation: Add doc comments for public functions and modules
- Linting: Code must pass `cargo clippy` with no warnings
- Formatting: Code must be formatted with `cargo fmt`
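A hypothetical fragment in the style these guidelines describe; the names mirror `types.rs` and `errors.rs`, but the actual definitions in the crate may differ:

```rust
// Validated newtype plus a descriptive error variant (illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockNumber(u64);

#[derive(Debug)]
pub enum BlockchainError {
    /// Requested block is past the finalized head (requested, head).
    BlockNotFinalized(u64, u64),
}

impl BlockNumber {
    /// Validate against the finalized head before constructing.
    pub fn new(n: u64, finalized_head: u64) -> Result<Self, BlockchainError> {
        if n > finalized_head {
            return Err(BlockchainError::BlockNotFinalized(n, finalized_head));
        }
        Ok(Self(n))
    }
}
```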
```
fossil-headers-db/
├── src/
│   ├── main.rs                  # Legacy CLI binary
│   ├── lib.rs                   # Library root with public API
│   ├── types.rs                 # Type-safe domain models
│   ├── errors.rs                # Error types and handling
│   ├── commands/                # Legacy CLI commands
│   ├── indexer/
│   │   ├── main.rs              # Modern indexer binary
│   │   ├── lib.rs               # Indexer orchestration
│   │   ├── batch_service.rs     # Batch indexer implementation
│   │   └── quick_service.rs     # Quick indexer implementation
│   ├── db/
│   │   └── mod.rs               # Database connection and operations
│   ├── repositories/
│   │   ├── block_header.rs      # Block header repository
│   │   └── index_metadata.rs    # Indexer metadata repository
│   ├── rpc/
│   │   └── mod.rs               # Ethereum RPC client
│   └── router/
│       └── mod.rs               # HTTP health check endpoints
├── migrations/                  # SQLx database migrations
├── tests/                       # Integration tests
├── docker/                      # Docker configurations
│   ├── Dockerfile.indexer       # Production indexer image
│   ├── Dockerfile.migrate       # Migration runner image
│   └── docker-compose.local.yml # Local development compose
├── docs/                        # Comprehensive documentation
├── Makefile                     # Development commands
└── Cargo.toml                   # Rust dependencies
```
Fossil Ecosystem:
- Fossil Light Client - MMR Builder and Light Client using this indexer's data
- Fossil Monorepo - Pitchlake Coprocessor (Proving Service, API, StarkNet contracts)
External Dependencies:
GNU General Public License v3.0
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
See LICENSE for details.
- Documentation: docs/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Built with ❤️ by the Fossil Team