
Tonbo


Website | Rust Doc | Blog | Community

Tonbo is an embedded database for serverless and edge runtimes. Your data is stored as Parquet on S3, coordination happens through a manifest, and compute stays fully stateless.

Why Tonbo?

Serverless compute is stateless, but your data isn't. Tonbo bridges this gap:

  • Async-first: The entire storage and query engine is async, built for serverless and edge environments
  • No server to manage: Data lives on S3, coordination happens through a manifest, and compute stays stateless
  • Arrow-native: Define rich data types with declarative schemas and query them as zero-copy RecordBatches
  • Runs anywhere: Tokio, WASM, edge runtimes, or as a storage engine for building your own data infrastructure
  • Open formats: Standard Parquet files readable by any tool

When to use Tonbo?

  • Build serverless or edge applications that need a durable state layer without running a database.
  • Store append-heavy or event-like data directly in S3 and query it with low overhead.
  • Embed a lightweight MVCC + Parquet storage engine inside your own data infrastructure.
  • Run workloads in WASM or Cloudflare Workers that require structured persistence.

Quick Start

use tonbo::{db::{AwsCreds, ObjectSpec, S3Spec}, prelude::*};

#[derive(Record)]
struct User {
    #[metadata(k = "tonbo.key", v = "true")]
    id: String,
    name: String,
    score: Option<i64>,
}

// Open on S3
let s3 = S3Spec::new("my-bucket", "data/users", AwsCreds::from_env()?);
let db = DbBuilder::from_schema(User::schema())?
    .object_store(ObjectSpec::s3(s3))?.open().await?;

// Insert
let users = vec![User { id: "u1".into(), name: "Alice".into(), score: Some(100) }];
let mut builders = User::new_builders(users.len());
builders.append_rows(users);
db.ingest(builders.finish().into_record_batch()).await?;

// Query
let filter = Predicate::gt(ColumnRef::new("score"), ScalarValue::from(80_i64));
let results = db.scan().filter(filter).collect().await?;

For local development, use .on_disk("/tmp/users")? instead. See examples/ for more.
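As a concrete variant, only the open call changes between backends. The lines below reuse the builder names already shown in the snippet above; the exact chaining of .on_disk is an assumption based on that snippet:

let db = DbBuilder::from_schema(User::schema())?
    .on_disk("/tmp/users")?
    .open().await?;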

Installation

cargo add tonbo@0.4.0-a0 tokio

Or add to Cargo.toml:

[dependencies]
tonbo = "0.4.0-a0"
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }

Examples

Run with cargo run --example <name>:

  • 01_basic: Define a schema, insert, and query in 30 lines
  • 02_transaction: MVCC transactions with upsert, delete, and read-your-writes
  • 02b_snapshot: Consistent point-in-time reads while writes continue
  • 03_filter: Predicates (eq, gt, in, is_null, and, or, not)
  • 04_s3: Store Parquet files on S3/R2/MinIO with zero server config
  • 05_scan_options: Projection pushdown reads only the columns you need
  • 06_composite_key: Multi-column keys for time-series and partitioned data
  • 07_streaming: Process millions of rows without loading them into memory
  • 08_nested_types: Deep struct nesting + Lists stored as Arrow StructArray
  • 09_time_travel: Query historical snapshots via MVCC timestamps

Architecture

Tonbo implements a merge-tree optimized for object storage: writes flow WAL → MemTable → Parquet SSTables, MVCC provides snapshot isolation, and a manifest committed via compare-and-swap coordinates everything:

  • Stateless compute: A worker only needs to read and update the manifest; no long-lived coordinator is required.
  • Object storage CAS: The manifest is committed using compare-and-swap on S3, so any function can safely participate in commits (see the sketch after this list).
  • Immutable data: Data files are write-once Parquet SSTables, which matches the strengths of S3 and other object stores.
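To make the coordination model concrete, here is a minimal sketch of the read-modify-CAS commit loop. Every name in it (Manifest, InMemoryManifestStore, put_if_match, commit_sstable) is an illustrative stand-in, not Tonbo's API; on real S3 the version check would ride on a conditional write (an ETag-guarded put) rather than an in-memory compare:

// Illustrative sketch only; hypothetical types, NOT Tonbo's API.
// Shows the shape of manifest coordination via compare-and-swap (CAS).

/// A manifest snapshot plus the version it was read at (on S3 this
/// role is played by a conditional-write token such as an ETag).
#[derive(Clone)]
struct Manifest {
    version: u64,
    sstables: Vec<String>,
}

/// Hypothetical store with a conditional put, standing in for S3.
struct InMemoryManifestStore {
    current: Manifest,
}

impl InMemoryManifestStore {
    fn load(&self) -> Manifest {
        self.current.clone()
    }

    /// Commits `next` only if the stored version still equals `expected`.
    /// Returns false when another writer committed first (CAS failure).
    fn put_if_match(&mut self, expected: u64, next: Manifest) -> bool {
        if self.current.version == expected {
            self.current = next;
            true
        } else {
            false
        }
    }
}

/// Any stateless worker commits a new SSTable by looping: read, edit, CAS.
fn commit_sstable(store: &mut InMemoryManifestStore, sstable: &str) {
    loop {
        let base = store.load(); // read the latest manifest
        let mut next = base.clone();
        next.version += 1;
        next.sstables.push(sstable.to_string()); // apply this writer's edit
        if store.put_if_match(base.version, next) {
            return; // CAS succeeded: the commit is visible to all readers
        }
        // CAS lost: another worker committed in between; reload and retry.
    }
}

fn main() {
    let mut store = InMemoryManifestStore {
        current: Manifest { version: 0, sstables: Vec::new() },
    };
    commit_sstable(&mut store, "sst-000001.parquet");
    commit_sstable(&mut store, "sst-000002.parquet");
    assert_eq!(store.current.version, 2);
    println!("manifest now lists {:?}", store.current.sstables);
}

Because a writer that loses the race simply reloads and retries, no lock service or long-lived coordinator is needed; the object store's conditional write is the only synchronization point.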

See docs/overview.md for the full design.

Project status

Tonbo is currently in alpha. APIs may change, and we're actively iterating based on feedback. We recommend starting with development and non-critical workloads before moving to production.

Features

Storage

  • Parquet files on object storage (S3, R2) or local filesystem
  • Manifest-driven coordination (CAS commits, no server needed)
  • (in-progress) Remote compaction (offload to serverless functions)
  • (in-progress) Branching (git-like fork and merge for datasets)
  • Time-window compaction strategy

Schema & Query

  • Arrow-native schemas (#[derive(Record)] or dynamic Schema)
  • Projection pushdown (read only needed columns)
  • Zero-copy reads via Arrow RecordBatch
  • (in-progress) Filter pushdown (predicates evaluated at storage layer)

Backends

  • Local filesystem
  • S3 / S3-compatible (R2, MinIO)
  • (in-progress) OPFS (browser storage)

Runtime

  • Async-first (Tokio)
  • Edge runtimes (Deno, Cloudflare Workers)
  • WebAssembly
  • (in-progress) io_uring
  • (in-progress) Python async bindings
  • JavaScript/TypeScript async bindings

Integrations

  • DataFusion TableProvider
  • Postgres Foreign Data Wrapper

License

Apache License 2.0. See LICENSE for details.