Adaptive HashMap Studio is an end-to-end playground for modern hash map design. It packages production-ready data structures, CSV-driven workload tooling, automated benchmarks, and rich UIs (PyQt6 Mission Control + Textual TUI) into a single repository. Every feature is exercised through lint/type/test gates and captured in reproducible audits.
Work in progress: Phases 0–2 are complete and thoroughly audited; Phase 3 (deployment & integration) is underway. Expect frequent updates as we continue hardening the platform.
- Three map backends (two-level chaining, Robin Hood, adaptive hybrid) with live migration and guardrails; a minimal Robin Hood sketch follows this list.
- CSV workload generator, profiler, and replay engine with latency reservoirs, JSON summaries, and metrics streaming.
- Snapshot lifecycle: save/load, versioned header + checksum, offline compaction and safe repair.
- Mission Control desktop app featuring telemetry charts, config editor, benchmark suite manager, workload DNA explorer, and a new snapshot inspector with historical replay controls.
- Textual TUI for terminal dashboards, batch benchmark runner, Prometheus-compatible metrics server, and `inspect-snapshot` CLI inspections.
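The Robin Hood backend mentioned above is easiest to grasp in code. Below is a minimal, self-contained sketch of the displacement rule (an incoming entry evicts any resident sitting closer to its home slot). It illustrates the strategy only; it is not the repository's implementation, and resizing, deletion, and tombstones are omitted.

```python
def robin_hood_insert(table: list, key, value) -> None:
    """Insert with Robin Hood displacement. `table` is a list of
    [key, value, probe_distance] slots (None = empty) and must have
    at least one free slot; resizing is omitted for brevity."""
    n = len(table)
    entry = [key, value, 0]
    idx = hash(key) % n
    while True:
        slot = table[idx]
        if slot is None:
            table[idx] = entry               # found an empty slot
            return
        if slot[0] == entry[0]:
            slot[1] = entry[1]               # update existing key in place
            return
        if slot[2] < entry[2]:               # resident is "richer": swap and
            table[idx], entry = entry, slot  # keep probing with the evictee
        idx = (idx + 1) % n
        entry[2] += 1                        # one step further from home

table = [None] * 8
robin_hood_insert(table, "K1", 42)
```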
Everything in this README is current as of the latest audit (October 10, 2025).
- Python: 3.11 or 3.12 (we test against both). Python 3.9/3.10 are no longer supported.
- OS: macOS or Linux shell. Windows users should work inside WSL2.
- Optional extras:
  - `PyQt6`, `pyqtgraph`, `numpy` for Mission Control (`pip install .[gui]`).
  - `textual`, `rich` for the terminal dashboard (`pip install .[ui]`).
  - Prometheus / Grafana if you plan to ingest `/metrics` output.
Install everything for development:
```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e .[dev,gui,ui]
```

To capture subprocess coverage locally, export the helper path before running coverage:

```bash
export PYTHONPATH="tools/coverage:$PYTHONPATH"
export COVERAGE_PROCESS_START=.coveragerc
```

The editable install registers a `hashmap-cli` entry point, and you can also invoke the CLI via `python -m hashmap_cli …` if you prefer module execution.
- Generate a config (TOML is the canonical format):

  ```bash
  python -m hashmap_cli config-wizard --outfile config/config.toml
  ```
- Dry-run a workload to catch CSV issues without executing it:

  ```bash
  python -m hashmap_cli run-csv --csv data/workloads/demo.csv --dry-run
  ```
- Replay with metrics + JSON summary:

  ```bash
  python -m hashmap_cli --config config/config.toml run-csv \
    --csv data/workloads/demo.csv \
    --json-summary-out results/json/demo_metrics_session.json \
    --metrics-out-dir runs/metrics_demo
  ```
- Launch Mission Control to inspect metrics, snapshots, and configs:

  ```bash
  python -m hashmap_cli mission-control
  ```
- Sanity-check snapshots from the CLI or GUI:

  ```bash
  python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 10
  ```
- Visualise probe paths to understand collision behaviour:

  ```bash
  python -m hashmap_cli --mode fast-lookup probe-visualize \
    --operation get --key K1 \
    --snapshot snapshots/uniform.pkl.gz
  ```

  Add `--seed KEY=VALUE` (repeatable) to pre-populate an ad-hoc map, or `--json` to receive a machine-readable trace.
- Run the regression suite (all must pass before pushing):

  ```bash
  make lint
  make type
  make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)
  ```

  Running on Windows runners? Set `PYTEST_TIMEOUT_METHOD=thread` (or edit `pytest.ini`) because the default `signal` method is unavailable on Windows.
All three commands are logged in `reports/command_run_log.tsv` along with the full async audit transcripts.
```bash
python -m hashmap_cli mission-control
```
Major panels:
- Telemetry – Throughput, load-factor, latency histogram, probe distribution, key heatmap, FFT analytics. The header reflects the exact tick/series you’re viewing. The “Keep history between runs” toggle preserves or resets charts between workloads.
- Config Editor – Schema-driven editor with preset refresh/save, fully synced with CLI tooling.
- Snapshot Inspector – Load `.pkl`/`.pkl.gz` files, review header metadata, filter/search keys, export history, and monitor load-factor thresholds per snapshot.
- Benchmark Suites – Discover and run TOML specs using the batch runner, with live logs and workload DNA analysis.
- Workload DNA – Inspect CSV characteristics (ratios, entropy, hot keys) before execution.
- Probe Visualizer – Load JSON traces exported by `probe-visualize`, or run the CLI with `--json` inside the Run Command panel to stream paths directly into Mission Control.
Mission Control honours `ADHASH_TOKEN` for metrics authentication and gracefully handles headless environments (off-screen smoke tests live in `tests/test_mission_control_widgets_qt.py`).
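For headless runs (CI boxes without a display), forcing Qt's stock `offscreen` platform plugin before launch is usually enough. A small sketch, assuming only the documented `mission-control` subcommand:

```python
import os
import subprocess

# QT_QPA_PLATFORM is standard Qt, not project-specific; "offscreen"
# renders without a display server, which is how headless smoke tests
# for Qt applications are commonly driven.
env = dict(os.environ, QT_QPA_PLATFORM="offscreen")
subprocess.run(["python", "-m", "hashmap_cli", "mission-control"], env=env, check=True)
```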
This view shows real-time throughput, latency, and probe distribution analytics during live workload runs.
The Config Editor provides a live, schema-driven interface for adjusting backend parameters, saving configuration presets, and synchronizing values with the CLI and TOML configuration files.
It eliminates the need to manually edit `config/config.toml` and ensures all key-value pairs are validated before runtime.
The Snapshot Inspector provides an interactive viewer for exploring serialized benchmark snapshots (`.pkl`, `.pkl.gz`) generated by the CLI or batch runner. It allows developers to quickly audit workload state, review load-factor metadata, and filter individual keys without manually unpacking pickle files.
The Benchmark Suites panel provides a visual batch runner and report generator for executing multi-job TOML specifications. It’s designed for high-throughput testing of multiple backends, parameter combinations, or workload types in a single automated session.
The Workload DNA visualizer provides an interactive graphical analysis of workload structure and hash-map distribution characteristics. It translates raw workload CSV statistics into color-coded histograms that reveal key distribution, probe depth, and entropy patterns in real time.
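For intuition, the entropy figure can be approximated from the raw CSV with a few lines of Python. A back-of-the-envelope sketch, assuming the workload CSV exposes a `key` column (see `docs/workload_schema.md` for the authoritative columns); the statistic Workload DNA actually reports may be normalised differently:

```python
import csv
import math
from collections import Counter

def key_entropy_bits(csv_path: str) -> float:
    """Shannon entropy of the key distribution: high for uniform access,
    low when a handful of hot keys dominate."""
    counts: Counter[str] = Counter()
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            counts[row["key"]] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(f"{key_entropy_bits('runs/workloads/w_skew_adv.csv'):.2f} bits")
```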
- Build the production image: `make docker-build` (multi-stage, non-root, health-checked).
- Launch the metrics dashboard quickly:

  ```bash
  docker run --rm \
    -p 9090:9090 \
    -e ADHASH_METRICS_PORT=9090 \
    adaptive-hashmap-cli:local serve \
    --host 0.0.0.0 \
    --port 9090
  ```

  Binding to `0.0.0.0` ensures the mapped port is reachable from the host OS; swap in another host or port if you run behind a reverse proxy.
- Replay a workload inside the container:

  ```bash
  docker run --rm \
    -v "$(pwd)/data:/data:ro" \
    -v "$(pwd)/snapshots:/snapshots" \
    -e ADHASH_METRICS_PORT=9090 \
    adaptive-hashmap-cli:local run-csv \
    --csv /data/workloads/w_uniform.csv \
    --metrics-port 9090 \
    --metrics-host 0.0.0.0 \
    --metrics-out-dir /snapshots/metrics
  ```
- Compose stack (metrics + workload runner): `docker compose up --build` (defaults expect `data/workloads/demo.csv`).
- Developer-friendly image: `make docker-build-dev` or `docker build -f docker/Dockerfile.dev` for an editable environment with `.[dev]` installed.
- Need a random high port? Set `--metrics-port auto` (or `ADHASH_METRICS_PORT=auto`) and the CLI will log the bound port after startup.
See `docs/containers/README.md` for environment variables, health checks, and release automation hooks.
The following mini-tour lets you experience every major feature with real commands. Feel free to copy/paste line-by-line.
```bash
# Step 1A: create a config and adjust thresholds quickly
python -m hashmap_cli config-wizard --outfile runs/demo_config.toml

# Step 1B: generate workloads with different access patterns
mkdir -p runs/workloads
python -m hashmap_cli generate-csv --outfile runs/workloads/w_uniform.csv --ops 50000 --read-ratio 0.8 --seed 7
python -m hashmap_cli generate-csv --outfile runs/workloads/w_skew_adv.csv --ops 50000 --read-ratio 0.6 --key-skew 1.2 --adversarial-ratio 0.15

# Step 1C: profile to see which backend is recommended
python -m hashmap_cli profile --csv runs/workloads/w_uniform.csv
python -m hashmap_cli profile --csv runs/workloads/w_skew_adv.csv --then get HOTKEY

# Step 2A: dry-run for validation first
python -m hashmap_cli run-csv --csv runs/workloads/w_uniform.csv --dry-run

# Step 2B: run with metrics streaming + JSON summary + snapshot
mkdir -p runs/metrics_uniform results/json snapshots
python -m hashmap_cli --config runs/demo_config.toml run-csv \
  --csv runs/workloads/w_uniform.csv \
  --metrics-port 9090 \
  --metrics-out-dir runs/metrics_uniform \
  --json-summary-out results/json/uniform_summary.json \
  --snapshot-out snapshots/uniform.pkl.gz --compress

# Step 2C: view metrics in Mission Control (new snapshot inspector + history controls)
python -m hashmap_cli mission-control

# Step 2D: confirm snapshot metadata from the CLI as well
python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 15 --key "'K1'"

# Step 3A: run a skewed workload to force Robin Hood migration + compaction
mkdir -p runs/metrics_skew
python -m hashmap_cli --mode adaptive run-csv \
  --csv runs/workloads/w_skew_adv.csv \
  --json-summary-out results/json/skew_summary.json \
  --metrics-out-dir runs/metrics_skew \
  --snapshot-out snapshots/skew.pkl.gz --compress

# Step 3B: inspect guardrail alerts in the TUI (optional)
python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics

# Step 3C: offline compact the Robin Hood snapshot, then verify & repair
python -m hashmap_cli compact-snapshot --in snapshots/skew.pkl.gz --out snapshots/skew_compacted.pkl.gz --compress
python -m hashmap_cli verify-snapshot --in snapshots/skew_compacted.pkl.gz --repair --out snapshots/skew_repaired.pkl.gz --verbose

# Step 4A: create an alternate config (copy + tweak a field)
python -m hashmap_cli config-edit --infile runs/demo_config.toml --outfile runs/demo_config_candidate.toml --apply-preset default

# Step 4B: run paired benchmarks and collect comparison artifacts
python -m hashmap_cli ab-compare --csv runs/workloads/w_uniform.csv \
  --baseline-config runs/demo_config.toml \
  --candidate-config runs/demo_config_candidate.toml \
  --out-dir results/ab/uniform_demo

# Step 4C: surface throughput/latency deltas in the dashboard
python -m hashmap_cli serve --source results/ab/uniform_demo/artifacts/baseline/metrics/metrics.ndjson \
  --compare results/ab/uniform_demo/uniform_demo_baseline_vs_candidate.json

# Step 5A: run predefined suites (Markdown/HTML reports in results/)
python -m adhash.batch --spec docs/examples/batch_baseline.toml

# Step 5B: load the suite in Mission Control (Benchmark tab) to see log streaming and Workload DNA results
python -m hashmap_cli mission-control
```

By the end of this walkthrough you will have exercised config wizards, workload generation, live metrics (Mission Control + TUI), snapshot verification/repair, A/B comparisons, and batch reporting: the same flows covered in the automated audit.
| Category | Commands | Notes |
|---|---|---|
| Core ops | `put`, `get`, `del`, `items` | Work on any backend (`--mode adaptive`, `fast-lookup`, etc.). |
| Workloads | `generate-csv`, `profile`, `run-csv` | `run-csv` supports snapshots, live metrics, JSON summaries, dry-run validation, throttles. |
| Analytics | `workload-dna`, `ab-compare`, `inspect-snapshot` | Workload DNA reports skew/collision risk; `inspect-snapshot` surfaces versioned metadata and key lookups. |
| Config | `config-wizard`, `config-edit` | Schema-driven generator/editor with preset management. |
| Snapshots | `compact-snapshot`, `verify-snapshot` | Offline compaction/repair for Robin Hood maps with checksum verification. |
| Observability | `serve`, `mission-control`, `scripts/launch_tui.py` | Dashboard server, desktop UI, and terminal UI. |
Run `python -m hashmap_cli -h` for the full command list with flags.
- Add `--json` for machine-readable success payloads (`{"ok": true, "command": "run-csv", ...}`).
- Errors surface through standard envelopes (`BadInput`, `Invariant`, `Policy`, `IO`) with stable exit codes `{0, 2, 3, 4, 5}`.
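A sketch of consuming that envelope from a wrapper script. It assumes only what is documented above (the `--json` flag, the `ok`/`command` fields, and the exit-code set) plus the assumption that the payload is printed on stdout:

```python
import json
import subprocess

proc = subprocess.run(
    ["python", "-m", "hashmap_cli", "run-csv",
     "--csv", "data/workloads/demo.csv", "--dry-run", "--json"],
    capture_output=True, text=True,
)
# Exit code 0 signals success; 2/3/4/5 map to the error envelopes above.
payload = json.loads(proc.stdout)
if proc.returncode == 0 and payload.get("ok"):
    print(f"{payload['command']} validated")
else:
    print(f"failed with exit code {proc.returncode}")
```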
- Snapshots use a versioned header + BLAKE2b checksum (`src/adhash/io/snapshot_header.py`). Untrusted payloads are rejected.
- Saved objects include Robin Hood/Chaining/Adaptive maps; `inspect-snapshot` and Mission Control’s inspector expose metadata, filtered previews, and direct key searches.
- Configs are dataclass-backed (`src/adhash/config.py`) with env overrides. `config-edit` and Mission Control’s editor share the same schema and validation logic.
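To illustrate the header-plus-checksum idea (not the project's actual byte layout, which lives in `src/adhash/io/snapshot_header.py`), here is a toy round-trip with a hypothetical magic string:

```python
import hashlib

MAGIC = b"ADHS"  # hypothetical magic bytes; the real format differs

def wrap(payload: bytes, version: int = 1) -> bytes:
    """Prefix the payload with magic, version, and a BLAKE2b digest."""
    digest = hashlib.blake2b(payload, digest_size=16).digest()
    return MAGIC + version.to_bytes(2, "big") + digest + payload

def unwrap(blob: bytes) -> bytes:
    """Reject anything whose header or checksum does not match."""
    magic, version = blob[:4], int.from_bytes(blob[4:6], "big")
    digest, payload = blob[6:22], blob[22:]
    if magic != MAGIC or version != 1:
        raise ValueError("unsupported snapshot header")
    if hashlib.blake2b(payload, digest_size=16).digest() != digest:
        raise ValueError("checksum mismatch: payload rejected")
    return payload

assert unwrap(wrap(b"snapshot-bytes")) == b"snapshot-bytes"
```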
Typical flows:
```bash
python -m hashmap_cli --mode adaptive run-csv --csv data/workloads/w_heavy_adv.csv \
  --snapshot-out snapshots/adaptive.pkl.gz --compress
python -m hashmap_cli inspect-snapshot --in snapshots/adaptive.pkl.gz --key "'K1'" --limit 5
python -m hashmap_cli verify-snapshot --in snapshots/adaptive.pkl.gz --repair \
  --out snapshots/adaptive_repaired.pkl.gz --verbose
```

```bash
python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics
```
- Displays backend status, operations, load-factor trends, guardrail alerts, and latency percentiles directly in the terminal.
- Press `r` to refresh, `q` to quit. Works with the same `/api/metrics` JSON endpoint as the dashboard.
- `--probe-json trace.json` loads a trace on startup; press `p` to reload after re-exporting from the CLI.
- IDE tip: if you launch this from PyCharm/VS Code, make sure the run configuration includes the `--metrics-endpoint` argument; otherwise the script will print the usage banner and exit.
```bash
python -m adhash.batch --spec docs/examples/batch_baseline.toml
```
- Executes multi-run suites (profilers, `run-csv` jobs) and emits Markdown/HTML reports under `results/`.
- Mission Control’s Benchmark pane wraps the runner with a GUI for discovery, config, and log streaming.

See `docs/batch_runner.md` for spec syntax and report details.
```bash
python -m hashmap_cli serve --port 9090 --source runs/metrics_demo/metrics.ndjson --follow
```
- Serves `/api/metrics`, `/api/metrics/histogram/{latency,probe}`, `/api/metrics/heatmap`, `/api/metrics/history`, and `/api/events` in JSON.
- Optional Prometheus text output at `/metrics` (`docs/prometheus_grafana.md` has scrape configs, dashboards, alert examples).
- NDJSON artifacts (`--metrics-out-dir`, `--metrics-max-ticks`) retain historical ticks for replay, export, and offline analysis.
- Helper scripts:
  - `python scripts/query_metric_endpoint.py http://127.0.0.1:9090/api/metrics [dotted.jq.path]` – curl-style JSON fetcher (always pass the URL).
  - `python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson` – schema validator (requires the NDJSON path).

  Configure run configurations in your IDE with these arguments; running the scripts “naked” will trigger the usage error banner.
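Because the artifact is plain NDJSON (one `metrics.v1` tick per line), ad-hoc analysis needs nothing beyond the standard library. A minimal sketch that assumes only the one-JSON-object-per-line framing:

```python
import json

# Each line of the NDJSON artifact is a complete JSON tick; field names
# come from docs/metrics_schema.md and are not assumed here.
with open("runs/metrics_demo/metrics.ndjson") as fh:
    ticks = [json.loads(line) for line in fh if line.strip()]

print(f"{len(ticks)} ticks captured")
```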
Set `ADHASH_TOKEN` to require `Authorization: Bearer …`. The browser dashboard accepts `?token=` for bootstrapping, and both Mission Control and the TUI automatically include the header.
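Scripting against a token-protected endpoint only requires attaching the header yourself. A sketch using the standard library, assuming the `serve` examples above are running on port 9090:

```python
import json
import os
import urllib.request

req = urllib.request.Request("http://127.0.0.1:9090/api/metrics")
token = os.environ.get("ADHASH_TOKEN")
if token:
    # Same header Mission Control and the TUI send automatically.
    req.add_header("Authorization", f"Bearer {token}")
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```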
- See `docs/ops/runbook.md` for deployment, smoke-test, and release procedures (Mission Control, TUI, and CLI).
- `make smoke` remains the fastest end-to-end validation; run it after config changes and before tagging a release.
- Snapshots: versioned header + checksum, restricted unpickler. Treat third-party files as untrusted; the inspector surfaces checksum mismatches.
- Tokens: metrics dashboard enforces bearer tokens when `ADHASH_TOKEN` is set. No built-in TLS; front with your own reverse proxy for remote exposure.
- Guardrails: load-factor / probe / tombstone thresholds trigger alerts in logs, JSON, dashboards, TUI, and Mission Control banners.
Run locally before every push/release:
```bash
make lint   # ruff
make type   # mypy
make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)

# Docker smoke (pending manual run):
# docker build -t adaptive-hashmap-cli:local -f docker/Dockerfile .
# docker compose -f docker/docker-compose.yml up --build
# Start Docker Desktop first (or another daemon) before running the commands above.
```
Additional smoke:
- `make smoke` – generates a 2k-op workload and validates metrics output.
- `python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson` – asserts schema compliance (`metrics.v1`).
- Capture release notes with Towncrier: add a fragment under `newsfragments/` (e.g., `feature/1234.add-probe-panel.rst`) and run `make release` before tagging to update `docs/CHANGELOG.md`.
Comprehensive command transcripts live in `audits/audit.md` and `reports/`. `reports/command_run_log.tsv` captures every automated/manual invocation with timestamps and status codes.
```text
├── README.md # this guide (kept in sync with audits)
├── pyproject.toml # project metadata, dependencies, console scripts
├── Makefile # lint/type/test/build shortcuts
├── LICENSE # Apache 2.0 license text
├── NOTICE # third-party attributions
├── mypy.ini # static type checker configuration
├── config/ # generated configs captured in docs and audits
├── data/ # sample workloads and config fixtures
├── docker/ # container definitions, compose file, entrypoint
├── docs/ # documentation, guides, release notes
│ ├── containers/README.md # container and deployment reference
│ ├── examples/ # batch runner specs used across walkthroughs
│ ├── ops/runbook.md # operations playbook for releases
│ └── upgrade.md # roadmap and phased milestones
├── audits/ # narrative audit logs and external reviews
├── reports/ # command transcripts, HTML/Markdown audit outputs
├── results/ # JSON summaries, A/B comparisons, dashboards
├── runs/ # generated artifacts (metrics, snapshots, configs)
├── snapshots/ # sample serialized maps for demos/tests
├── scripts/ # helper launchers and tooling (Mission Control, TUI, metrics)
├── newsfragments/ # Towncrier release note fragments
├── build/ # local build artifacts (wheel/lib staging)
├── dist/ # distributable archives produced by builds
├── src/ # source packages
│ ├── adhash/ # core package (CLI, data structures, UIs)
│ │ ├── cli/ # CLI command wiring and orchestration
│ │ ├── core/ # hashmap algorithms and utilities
│ │ ├── metrics/ # metrics server and schema helpers
│ │ ├── mission_control/ # PyQt6 desktop application
│ │ ├── tui/ # Textual terminal UI
│ │ ├── workloads/ # workload generation and profiling tools
│ │ └── hashmap_cli.py # console entry point for `hashmap-cli`
│ └── hashmap_cli/__init__.py # namespace marker used by console scripts
└── tests/ # pytest suite (CLI, GUI, metrics, snapshots)
```
- `audits/audit.md` – authoritative verification log (12 sections + demo), refreshed Oct 2025.
- `docs/config.md` – configuration schema and overrides.
- `docs/metrics_schema.md` – JSON shapes for ticks, histograms, heatmaps, events.
- `docs/prometheus_grafana.md` – integration guide for metrics exporters.
- `docs/batch_runner.md` – benchmark specs & output format.
- `docs/workload_schema.md` – CSV column definitions and validator behaviour.
- `docs/control_surface.md` – REST/Python control surface design and rollout checklist.
- `docs/analysis/probe_visualizer.md` – detailed probe tracing guide.
- Keep lint/type/test spotless; add new tests alongside features (UI widgets have Qt smoke tests under `tests/`).
- Update `audits/audit.md` and `reports/command_run_log.tsv` when adding major commands or artifacts.
- Phase 3 work (deployment & integration) is tracked in `docs/upgrade.md`: Docker packaging, release automation, Helm/Compose templates, etc.
Questions or patches? Open an issue or PR; include the commands/tests you ran and highlight any schema or snapshot changes.
Copyright © 2025 Justin Guida. Licensed under the Apache License, Version 2.0. See LICENSE for details and NOTICE for attribution.
