
🔑 Adaptive Hash Map Studio


Adaptive Hash Map Studio is an end-to-end playground for modern hash map design. It packages production-ready data structures, CSV-driven workload tooling, automated benchmarks, and rich UIs (PyQt6 Mission Control + Textual TUI) into a single repository. Every feature is exercised through lint/type/test gates and captured in reproducible audits.

Work in progress: Phases 0–2 are complete and thoroughly audited; Phase 3 (deployment & integration) is underway. Expect frequent updates as we continue hardening the platform.


Key capabilities include:

  • Three map backends (two-level chaining, Robin Hood, adaptive hybrid) with live migration and guardrails.
  • CSV workload generator, profiler, and replay engine with latency reservoirs, JSON summaries, and metrics streaming.
  • Snapshot lifecycle: save/load, versioned header + checksum, offline compaction and safe repair.
  • Mission Control desktop app featuring telemetry charts, config editor, benchmark suite manager, workload DNA explorer, and a new snapshot inspector with historical replay controls.
  • Textual TUI for terminal dashboards, a batch benchmark runner, a Prometheus-compatible metrics server, and snapshot inspection via the inspect-snapshot CLI.

Everything in this README is current as of the latest audit (October 10, 2025).

Requirements

  • Python: 3.11 or 3.12 (we test against both). Python 3.9/3.10 are no longer supported.
  • OS: macOS or Linux. Windows users should work inside WSL2.
  • Optional extras:
    • PyQt6, pyqtgraph, numpy for Mission Control (pip install .[gui]).
    • textual, rich for the terminal dashboard (pip install .[ui]).
    • Prometheus / Grafana if you plan to ingest /metrics output.

Install everything for development:

python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e .[dev,gui,ui]

To capture subprocess coverage locally, export the helper path before running coverage:

export PYTHONPATH="tools/coverage:$PYTHONPATH"
export COVERAGE_PROCESS_START=.coveragerc

The editable install registers a hashmap-cli entry point, and you can also invoke the CLI via python -m hashmap_cli … if you prefer module execution.
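To confirm the install, both invocations print the top-level help:

hashmap-cli -h            # console-script entry point
python -m hashmap_cli -h  # equivalent module invocation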

Quick Start

  1. Generate a config (TOML is the canonical format):

    python -m hashmap_cli config-wizard --outfile config/config.toml
  2. Dry-run a workload to catch CSV issues without executing it:

    python -m hashmap_cli run-csv --csv data/workloads/demo.csv --dry-run
  3. Replay with metrics + JSON summary:

    python -m hashmap_cli --config config/config.toml run-csv \
      --csv data/workloads/demo.csv \
      --json-summary-out results/json/demo_metrics_session.json \
      --metrics-out-dir runs/metrics_demo
  4. Launch Mission Control to inspect metrics, snapshots, and configs:

    python -m hashmap_cli mission-control
  5. Sanity-check snapshots from the CLI or GUI:

    python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 10
  6. Visualise probe paths to understand collision behaviour:

    python -m hashmap_cli --mode fast-lookup probe-visualize \
      --operation get --key K1 \
      --snapshot snapshots/uniform.pkl.gz

    Add --seed KEY=VALUE (repeatable) to pre-populate an ad-hoc map, or --json to receive a machine-readable trace (see the seeded example after this list).

  7. Run the regression suite (all must pass before pushing):

    make lint
    make type
    make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)

Running on Windows runners? Set PYTEST_TIMEOUT_METHOD=thread (or edit pytest.ini) because the default signal method is unavailable on Windows.

All three commands are logged in reports/command_run_log.tsv along with the full async audit transcripts.
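The seeded probe trace mentioned in step 6 looks roughly like this (a sketch combining the documented --seed and --json flags; keys and values are illustrative):

python -m hashmap_cli --mode fast-lookup probe-visualize \
  --operation get --key K2 \
  --seed K1=v1 --seed K2=v2 \
  --json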

Mission Control (PyQt6)

python -m hashmap_cli mission-control

Major panels:

  • Telemetry – Throughput, load-factor, latency histogram, probe distribution, key heatmap, FFT analytics. The header reflects the exact tick/series you’re viewing. The “Keep history between runs” toggle preserves or resets charts between workloads.
  • Config Editor – Schema-driven editor with preset refresh/save, fully synced with CLI tooling.
  • Snapshot Inspector – Load .pkl/.pkl.gz, review header metadata, filter/search keys, export history, and monitor load-factor thresholds per snapshot.
  • Benchmark Suites – Discover and run TOML specs using the batch runner, with live logs and workload DNA analysis.
  • Workload DNA – Inspect CSV characteristics (ratios, entropy, hot keys) before execution.
  • Probe Visualizer – Load JSON traces exported by probe-visualize, or run the CLI with --json inside the Run Command panel to stream paths directly into Mission Control.
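To feed that panel, export a trace from the CLI first (a sketch; it assumes the --json trace is emitted on stdout):

python -m hashmap_cli --mode fast-lookup probe-visualize \
  --operation get --key K1 \
  --snapshot snapshots/uniform.pkl.gz \
  --json > trace.json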

Mission Control honours ADHASH_TOKEN for metrics authentication and gracefully handles headless environments (off-screen smoke tests live in tests/test_mission_control_widgets_qt.py).


Mission Control Telemetry Dashboard

This view shows real-time throughput, latency, and probe distribution analytics during live workload runs.


Config Editor Panel

The Config Editor provides a live, schema-driven interface for adjusting backend parameters, saving configuration presets, and synchronizing values with the CLI and TOML configuration files. It eliminates the need to manually edit config/config.toml and ensures all key-value pairs are validated before runtime.


Snapshot Inspector

The Snapshot Inspector provides an interactive viewer for exploring serialized benchmark snapshots (.pkl, .pkl.gz) generated by the CLI or batch runner. It allows developers to quickly audit workload state, review load-factor metadata, and filter individual keys without manually unpacking pickle files.


Benchmark Suites

The Benchmark Suites panel provides a visual batch runner and report generator for executing multi-job TOML specifications. It’s designed for high-throughput testing of multiple backends, parameter combinations, or workload types in a single automated session.


Workload DNA Visualizer

The Workload DNA visualizer provides an interactive graphical analysis of workload structure and hash-map distribution characteristics. It translates raw workload CSV statistics into color-coded histograms that reveal key distribution, probe depth, and entropy patterns in real time.


Containers & Deployment

  • Build the production image: make docker-build (multi-stage, non-root, health-checked).

  • Launch the metrics dashboard quickly:

    docker run --rm \
      -p 9090:9090 \
      -e ADHASH_METRICS_PORT=9090 \
      adaptive-hashmap-cli:local serve \
      --host 0.0.0.0 \
      --port 9090

    Binding to 0.0.0.0 ensures the mapped port is reachable from the host OS; swap in another host or port if you run behind a reverse proxy.

  • Replay a workload inside the container:

    docker run --rm \
      -v "$(pwd)/data:/data:ro" \
      -v "$(pwd)/snapshots:/snapshots" \
      -e ADHASH_METRICS_PORT=9090 \
      adaptive-hashmap-cli:local run-csv \
        --csv /data/workloads/w_uniform.csv \
        --metrics-port 9090 \
        --metrics-host 0.0.0.0 \
        --metrics-out-dir /snapshots/metrics
  • Compose stack (metrics + workload runner): docker compose up --build (defaults expect data/workloads/demo.csv).

  • Developer-friendly image: make docker-build-dev or docker build -f docker/Dockerfile.dev for an editable environment with .[dev] installed.

  • Need a random high port? Set --metrics-port auto (or ADHASH_METRICS_PORT=auto) and the CLI will log the bound port after startup.
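For example, a local (non-container) run that picks a free port, sketched from the documented run-csv flags:

python -m hashmap_cli run-csv \
  --csv data/workloads/demo.csv \
  --metrics-port auto
# The CLI logs the bound port after startup.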

See docs/containers/README.md for environment variables, health checks, and release automation hooks.

Hands-On Walkthrough

The following mini-tour lets you experience every major feature with real commands. Feel free to copy/paste line-by-line.

1. Generate & profile workloads

# Step 1A: create a config and adjust thresholds quickly
python -m hashmap_cli config-wizard --outfile runs/demo_config.toml

# Step 1B: generate workloads with different access patterns
mkdir -p runs/workloads
python -m hashmap_cli generate-csv --outfile runs/workloads/w_uniform.csv --ops 50000 --read-ratio 0.8 --seed 7
python -m hashmap_cli generate-csv --outfile runs/workloads/w_skew_adv.csv --ops 50000 --read-ratio 0.6 --key-skew 1.2 --adversarial-ratio 0.15

# Step 1C: profile to see which backend is recommended
python -m hashmap_cli profile --csv runs/workloads/w_uniform.csv
python -m hashmap_cli profile --csv runs/workloads/w_skew_adv.csv --then get HOTKEY

2. Replay workloads with metrics + snapshots

# Step 2A: dry-run for validation first
python -m hashmap_cli run-csv --csv runs/workloads/w_uniform.csv --dry-run

# Step 2B: run with metrics streaming + JSON summary + snapshot
mkdir -p runs/metrics_uniform results/json snapshots
python -m hashmap_cli --config runs/demo_config.toml run-csv \
  --csv runs/workloads/w_uniform.csv \
  --metrics-port 9090 \
  --metrics-out-dir runs/metrics_uniform \
  --json-summary-out results/json/uniform_summary.json \
  --snapshot-out snapshots/uniform.pkl.gz --compress

# Step 2C: view metrics in Mission Control (new snapshot inspector + history controls)
python -m hashmap_cli mission-control

# Step 2D: confirm snapshot metadata from the CLI as well
python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 15 --key "'K1'"

3. Explore guardrails and migrations

# Step 3A: run a skewed workload to force Robin Hood migration + compaction
mkdir -p runs/metrics_skew
python -m hashmap_cli --mode adaptive run-csv \
  --csv runs/workloads/w_skew_adv.csv \
  --json-summary-out results/json/skew_summary.json \
  --metrics-out-dir runs/metrics_skew \
  --snapshot-out snapshots/skew.pkl.gz --compress

# Step 3B: inspect guardrail alerts in the TUI (optional)
python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics

# Step 3C: offline compact the Robin Hood snapshot then verify & repair
python -m hashmap_cli compact-snapshot --in snapshots/skew.pkl.gz --out snapshots/skew_compacted.pkl.gz --compress
python -m hashmap_cli verify-snapshot --in snapshots/skew_compacted.pkl.gz --repair --out snapshots/skew_repaired.pkl.gz --verbose

4. Compare configurations (A/B harness)

# Step 4A: create an alternate config (copy + tweak a field)
python -m hashmap_cli config-edit --infile runs/demo_config.toml --outfile runs/demo_config_candidate.toml --apply-preset default

# Step 4B: run paired benchmarks and collect comparison artifacts
python -m hashmap_cli ab-compare --csv runs/workloads/w_uniform.csv \
  --baseline-config runs/demo_config.toml \
  --candidate-config runs/demo_config_candidate.toml \
  --out-dir results/ab/uniform_demo

# Step 4C: surface throughput/latency deltas in the dashboard
python -m hashmap_cli serve --source results/ab/uniform_demo/artifacts/baseline/metrics/metrics.ndjson \
  --compare results/ab/uniform_demo/uniform_demo_baseline_vs_candidate.json

5. Batch suites & workload analytics

# Step 5A: run predefined suites (Markdown/HTML reports in results/)
python -m adhash.batch --spec docs/examples/batch_baseline.toml

# Step 5B: load the suite in Mission Control (Benchmark tab) to see log streaming and Workload DNA results
python -m hashmap_cli mission-control

By the end of this walkthrough you will have exercised config wizards, workload generation, live metrics (Mission Control + TUI), snapshot verification/repair, A/B comparisons, and batch reporting—the same flows covered in the automated audit.

CLI Surface Overview

  • Core ops (put, get, del, items): work on any backend (--mode adaptive, fast-lookup, etc.).
  • Workloads (generate-csv, profile, run-csv): run-csv supports snapshots, live metrics, JSON summaries, dry-run validation, and throttles.
  • Analytics (workload-dna, ab-compare, inspect-snapshot): Workload DNA reports skew/collision risk; inspect-snapshot surfaces versioned metadata and key lookups.
  • Config (config-wizard, config-edit): schema-driven generator/editor with preset management.
  • Snapshots (compact-snapshot, verify-snapshot): offline compaction/repair for Robin Hood maps with checksum verification.
  • Observability (serve, mission-control, scripts/launch_tui.py): dashboard server, desktop UI, and terminal UI.

Run python -m hashmap_cli -h for the full command list with flags.
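Core-op syntax, sketched from the walkthrough's profile example (get takes a positional key; put KEY VALUE is an assumed shape, confirm with -h):

python -m hashmap_cli --mode fast-lookup put K1 V1
python -m hashmap_cli --mode fast-lookup get K1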

JSON Envelopes & Exit Codes

  • Add --json for machine-readable success payloads ({"ok": true, "command": "run-csv", ...}).
  • Errors surface through standard envelopes (BadInput, Invariant, Policy, IO) with stable exit codes {0,2,3,4,5}.
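A minimal sketch of the pattern (confirm flag placement against -h):

python -m hashmap_cli run-csv --csv data/workloads/demo.csv --dry-run --json
echo $?   # 0 on success; error envelopes exit with a stable code from {2,3,4,5}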

Snapshots & Configuration

  • Snapshots use a versioned header + BLAKE2b checksum (src/adhash/io/snapshot_header.py). Untrusted payloads are rejected.
  • Saved objects include Robin Hood/Chaining/Adaptive maps; inspect-snapshot and Mission Control’s inspector expose metadata, filtered previews, and direct key searches.
  • Configs are dataclass-backed (src/adhash/config.py) with env overrides. config-edit and Mission Control’s editor share the same schema and validation logic.

Typical flows:

python -m hashmap_cli --mode adaptive run-csv --csv data/workloads/w_heavy_adv.csv \
  --snapshot-out snapshots/adaptive.pkl.gz --compress

python -m hashmap_cli inspect-snapshot --in snapshots/adaptive.pkl.gz --key "'K1'" --limit 5

python -m hashmap_cli verify-snapshot --in snapshots/adaptive.pkl.gz --repair \
  --out snapshots/adaptive_repaired.pkl.gz --verbose

Textual TUI (Terminal)

python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics

  • Displays backend status, operations, load-factor trends, guardrail alerts, and latency percentiles directly in the terminal.
  • r to refresh, q to quit. Works with the same /api/metrics JSON endpoint as the dashboard.
  • --probe-json trace.json loads a trace on startup; press p to reload after re-exporting from the CLI.
  • IDE tip: if you launch this from PyCharm/VS Code, make sure the run configuration includes the --metrics-endpoint argument; otherwise the script will print the usage banner and exit.
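Combining the documented flags, a startup that preloads a trace looks like this (trace.json is whatever you exported from probe-visualize):

python scripts/launch_tui.py \
  --metrics-endpoint http://127.0.0.1:9090/api/metrics \
  --probe-json trace.json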

Batch Benchmark Runner

python -m adhash.batch --spec docs/examples/batch_baseline.toml

  • Executes multi-run suites (profilers, run-csv jobs) and emits Markdown/HTML reports under results/.
  • Mission Control’s Benchmark pane wraps the runner with a GUI for discovery, config, and log streaming.

See docs/batch_runner.md for spec syntax and report details.

Observability & Metrics API

python -m hashmap_cli serve --port 9090 --source runs/metrics_demo/metrics.ndjson --follow

  • Serves /api/metrics, /api/metrics/histogram/{latency,probe}, /api/metrics/heatmap, /api/metrics/history, and /api/events in JSON.
  • Optional Prometheus text output at /metrics (docs/prometheus_grafana.md has scrape configs, dashboards, alert examples).
  • NDJSON artifacts (--metrics-out-dir, --metrics-max-ticks) retain historical ticks for replay, export, and offline analysis.
  • Helper scripts:
    • python scripts/query_metric_endpoint.py http://127.0.0.1:9090/api/metrics [dotted.jq.path] – curl-style JSON fetcher (always pass the URL).
    • python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson – schema validator (requires the NDJSON path). If you create IDE run configurations for these scripts, include the arguments shown; running them without arguments prints a usage error.

Set ADHASH_TOKEN to require Authorization: Bearer …. The browser dashboard accepts ?token= for bootstrapping, and both Mission Control & TUI automatically include the header.
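For example (the token value is illustrative; curl stands in for any HTTP client):

export ADHASH_TOKEN=change-me
python -m hashmap_cli serve --port 9090 --source runs/metrics_demo/metrics.ndjson --follow &
sleep 1   # give the server a moment to bind
curl -H "Authorization: Bearer $ADHASH_TOKEN" http://127.0.0.1:9090/api/metrics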

Operations Runbook

  • See docs/ops/runbook.md for deployment, smoke-test, and release procedures (Mission Control, TUI, and CLI).
  • make smoke remains the fastest end-to-end validation—run it after config changes and before tagging a release.

Security Considerations

  • Snapshots: versioned header + checksum, restricted unpickler. Treat third-party files as untrusted; the inspector surfaces checksum mismatches.
  • Tokens: metrics dashboard enforces bearer tokens when ADHASH_TOKEN is set. No built-in TLS—front with your own reverse proxy for remote exposure.
  • Guardrails: load-factor / probe / tombstone thresholds trigger alerts in logs, JSON, dashboards, TUI, and Mission Control banners.

Validation

Run locally before every push/release:

make lint   # ruff
make type   # mypy
make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)
# Docker smoke (pending manual run):
#   docker build -t adaptive-hashmap-cli:local -f docker/Dockerfile .
#   docker compose -f docker/docker-compose.yml up --build
# Start Docker Desktop first (or another daemon) before running the commands above.

Additional smoke:

  • make smoke – generates a 2k-op workload and validates metrics output.
  • python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson – asserts schema compliance (metrics.v1).
  • Capture release notes with Towncrier: add a fragment under newsfragments/ (e.g., feature/1234.add-probe-panel.rst) and run make release before tagging to update docs/CHANGELOG.md.
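For example (the fragment path mirrors the pattern above; the note text is illustrative):

mkdir -p newsfragments/feature
printf 'Add the probe visualizer panel to Mission Control.\n' \
  > newsfragments/feature/1234.add-probe-panel.rst
make release   # regenerates docs/CHANGELOG.md before tagging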

Comprehensive command transcripts live in audits/audit.md and reports/. reports/command_run_log.tsv captures every automated/manual invocation with timestamps and status codes.

Repository Map

├── README.md                      # this guide (kept in sync with audits)
├── pyproject.toml                 # project metadata, dependencies, console scripts
├── Makefile                       # lint/type/test/build shortcuts
├── LICENSE                        # Apache 2.0 license text
├── NOTICE                         # third-party attributions
├── mypy.ini                       # static type checker configuration
├── config/                        # generated configs captured in docs and audits
├── data/                          # sample workloads and config fixtures
├── docker/                        # container definitions, compose file, entrypoint
├── docs/                          # documentation, guides, release notes
│   ├── containers/README.md       # container and deployment reference
│   ├── examples/                  # batch runner specs used across walkthroughs
│   ├── ops/runbook.md             # operations playbook for releases
│   └── upgrade.md                 # roadmap and phased milestones
├── audits/                        # narrative audit logs and external reviews
├── reports/                       # command transcripts, HTML/Markdown audit outputs
├── results/                       # JSON summaries, A/B comparisons, dashboards
├── runs/                          # generated artifacts (metrics, snapshots, configs)
├── snapshots/                     # sample serialized maps for demos/tests
├── scripts/                       # helper launchers and tooling (Mission Control, TUI, metrics)
├── newsfragments/                 # Towncrier release note fragments
├── build/                         # local build artifacts (wheel/lib staging)
├── dist/                          # distributable archives produced by builds
├── src/                           # source packages
│   ├── adhash/                    # core package (CLI, data structures, UIs)
│   │   ├── cli/                   # CLI command wiring and orchestration
│   │   ├── core/                  # hashmap algorithms and utilities
│   │   ├── metrics/               # metrics server and schema helpers
│   │   ├── mission_control/       # PyQt6 desktop application
│   │   ├── tui/                   # Textual terminal UI
│   │   ├── workloads/             # workload generation and profiling tools
│   │   └── hashmap_cli.py         # console entry point for `hashmap-cli`
│   └── hashmap_cli/__init__.py    # namespace marker used by console scripts
└── tests/                         # pytest suite (CLI, GUI, metrics, snapshots)

Documentation & Audits

Guides and release notes live under docs/ (including docs/ops/runbook.md, docs/upgrade.md, and docs/batch_runner.md); narrative audit logs live in audits/, and command transcripts plus HTML/Markdown audit outputs live in reports/.

Contributing / Next Steps

  1. Keep lint/type/test spotless; add new tests alongside features (UI widgets have Qt smoke tests under tests/).
  2. Update audits/audit.md and reports/command_run_log.tsv when adding major commands or artifacts.
  3. Phase 3 work (deployment & integration) is tracked in docs/upgrade.md: Docker packaging, release automation, Helm/Compose templates, etc.

Questions or patches?

  • Open an issue or PR; include the commands/tests you ran and highlight any schema or snapshot changes.

License

Copyright © 2025 Justin Guida. Licensed under the Apache License, Version 2.0. See LICENSE for details and NOTICE for attribution.
