Adaptive HashMap Studio is an end-to-end playground for modern hash map design. It packages production-ready data structures, CSV-driven workload tooling, automated benchmarks, and rich UIs (PyQt6 Mission Control + Textual TUI) into a single repository. Every feature is exercised through lint/type/test gates and captured in reproducible audits.
Work in progress: Phases 0–2 are complete and thoroughly audited; Phase 3 (deployment & integration) is underway. Expect frequent updates as we continue hardening the platform.
- Three map backends (two-level chaining, Robin Hood, adaptive hybrid) with live migration and guardrails; a minimal Robin Hood sketch follows this list.
- CSV workload generator, profiler, and replay engine with latency reservoirs, JSON summaries, and metrics streaming.
- Snapshot lifecycle: save/load, versioned header + checksum, offline compaction and safe repair.
- Mission Control desktop app featuring telemetry charts, config editor, benchmark suite manager, workload DNA explorer, and a new snapshot inspector with historical replay controls.
- Textual TUI for terminal dashboards, batch benchmark runner, Prometheus-compatible metrics server, and `inspect-snapshot` CLI inspections.
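The Robin Hood backend mentioned above is easiest to grasp in code. Below is a minimal, self-contained sketch of the displacement rule (an incoming entry evicts any resident sitting closer to its home slot). It illustrates the strategy only; it is not the repository's implementation, and resizing, deletion, and tombstones are omitted.

```python
def robin_hood_insert(table: list, key, value) -> None:
    """Insert with Robin Hood displacement. `table` is a list of
    [key, value, probe_distance] slots (None = empty) and must have
    at least one free slot; resizing is omitted for brevity."""
    n = len(table)
    entry = [key, value, 0]
    idx = hash(key) % n
    while True:
        slot = table[idx]
        if slot is None:
            table[idx] = entry               # found an empty slot
            return
        if slot[0] == entry[0]:
            slot[1] = entry[1]               # update existing key in place
            return
        if slot[2] < entry[2]:               # resident is "richer": swap and
            table[idx], entry = entry, slot  # keep probing with the evictee
        idx = (idx + 1) % n
        entry[2] += 1                        # one step further from home

table = [None] * 8
robin_hood_insert(table, "K1", 42)
```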
Everything in this README is current as of the latest audit (October 10, 2025).
- Python: 3.11 or 3.12 (we test against both). Python 3.9/3.10 are no longer supported.
- OS: macOS or Linux shell. Windows users should work inside WSL2.
- Optional extras:
  - `PyQt6`, `pyqtgraph`, `numpy` for Mission Control (`pip install .[gui]`).
  - `textual`, `rich` for the terminal dashboard (`pip install .[ui]`).
  - Prometheus / Grafana if you plan to ingest `/metrics` output.
Install everything for development:
```bash
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install -e .[dev,gui,ui]
```

To capture subprocess coverage locally, export the helper path before running coverage:

```bash
export PYTHONPATH="tools/coverage:$PYTHONPATH"
export COVERAGE_PROCESS_START=.coveragerc
```

The editable install registers a `hashmap-cli` entry point, and you can also invoke the CLI via `python -m hashmap_cli …` if you prefer module execution.
- Generate a config (TOML is the canonical format):

  ```bash
  python -m hashmap_cli config-wizard --outfile config/config.toml
  ```
- Dry-run a workload to catch CSV issues without executing it:

  ```bash
  python -m hashmap_cli run-csv --csv data/workloads/demo.csv --dry-run
  ```
- Replay with metrics + JSON summary:

  ```bash
  python -m hashmap_cli --config config/config.toml run-csv \
    --csv data/workloads/demo.csv \
    --json-summary-out results/json/demo_metrics_session.json \
    --metrics-out-dir runs/metrics_demo
  ```
- Launch Mission Control to inspect metrics, snapshots, and configs:

  ```bash
  python -m hashmap_cli mission-control
  ```
- Sanity-check snapshots from the CLI or GUI:

  ```bash
  python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 10
  ```
- Visualise probe paths to understand collision behaviour:

  ```bash
  python -m hashmap_cli --mode fast-lookup probe-visualize \
    --operation get --key K1 \
    --snapshot snapshots/uniform.pkl.gz
  ```

  Add `--seed KEY=VALUE` (repeatable) to pre-populate an ad-hoc map, or `--json` to receive a machine-readable trace.
- Run the regression suite (all must pass before pushing):

  ```bash
  make lint
  make type
  make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)
  ```

  Running on Windows runners? Set `PYTEST_TIMEOUT_METHOD=thread` (or edit `pytest.ini`) because the default `signal` method is unavailable on Windows.
All three commands are logged in `reports/command_run_log.tsv` along with the full async audit transcripts.
```bash
python -m hashmap_cli mission-control
```
Major panels:
- Telemetry – Throughput, load-factor, latency histogram, probe distribution, key heatmap, FFT analytics. The header reflects the exact tick/series you’re viewing. The “Keep history between runs” toggle preserves or resets charts between workloads.
- Config Editor – Schema-driven editor with preset refresh/save, fully synced with CLI tooling.
- Snapshot Inspector – Load `.pkl`/`.pkl.gz` files, review header metadata, filter/search keys, export history, and monitor load-factor thresholds per snapshot.
- Benchmark Suites – Discover and run TOML specs using the batch runner, with live logs and workload DNA analysis.
- Workload DNA – Inspect CSV characteristics (ratios, entropy, hot keys) before execution.
- Probe Visualizer – Load JSON traces exported by `probe-visualize`, or run the CLI with `--json` inside the Run Command panel to stream paths directly into Mission Control.
Mission Control honours `ADHASH_TOKEN` for metrics authentication and gracefully handles headless environments (off-screen smoke tests live in `tests/test_mission_control_widgets_qt.py`).
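For headless runs (CI boxes without a display), forcing Qt's stock `offscreen` platform plugin before launch is usually enough. A small sketch, assuming only the documented `mission-control` subcommand:

```python
import os
import subprocess

# QT_QPA_PLATFORM is standard Qt, not project-specific; "offscreen"
# renders without a display server, which is how headless smoke tests
# for Qt applications are commonly driven.
env = dict(os.environ, QT_QPA_PLATFORM="offscreen")
subprocess.run(["python", "-m", "hashmap_cli", "mission-control"], env=env, check=True)
```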
This view shows real-time throughput, latency, and probe distribution analytics during live workload runs.
The Config Editor provides a live, schema-driven interface for adjusting backend parameters, saving configuration presets, and synchronizing values with the CLI and TOML configuration files.
It eliminates the need to manually edit `config/config.toml` and ensures all key-value pairs are validated before runtime.
The Snapshot Inspector provides an interactive viewer for exploring serialized benchmark snapshots (`.pkl`, `.pkl.gz`) generated by the CLI or batch runner. It allows developers to quickly audit workload state, review load-factor metadata, and filter individual keys without manually unpacking pickle files.
The Benchmark Suites panel provides a visual batch runner and report generator for executing multi-job TOML specifications. It’s designed for high-throughput testing of multiple backends, parameter combinations, or workload types in a single automated session.
The Workload DNA visualizer provides an interactive graphical analysis of workload structure and hash-map distribution characteristics. It translates raw workload CSV statistics into color-coded histograms that reveal key distribution, probe depth, and entropy patterns in real time.
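For intuition, the entropy figure can be approximated from the raw CSV with a few lines of Python. A back-of-the-envelope sketch, assuming the workload CSV exposes a `key` column (see `docs/workload_schema.md` for the authoritative columns); the statistic Workload DNA actually reports may be normalised differently:

```python
import csv
import math
from collections import Counter

def key_entropy_bits(csv_path: str) -> float:
    """Shannon entropy of the key distribution: high for uniform access,
    low when a handful of hot keys dominate."""
    counts: Counter[str] = Counter()
    with open(csv_path, newline="") as fh:
        for row in csv.DictReader(fh):
            counts[row["key"]] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(f"{key_entropy_bits('runs/workloads/w_skew_adv.csv'):.2f} bits")
```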
- Build the production image: `make docker-build` (multi-stage, non-root, health-checked).
- Launch the metrics dashboard quickly:

  ```bash
  docker run --rm \
    -p 9090:9090 \
    -e ADHASH_METRICS_PORT=9090 \
    adaptive-hashmap-cli:local serve \
    --host 0.0.0.0 \
    --port 9090
  ```

  Binding to `0.0.0.0` ensures the mapped port is reachable from the host OS; swap in another host or port if you run behind a reverse proxy.
- Replay a workload inside the container:

  ```bash
  docker run --rm \
    -v "$(pwd)/data:/data:ro" \
    -v "$(pwd)/snapshots:/snapshots" \
    -e ADHASH_METRICS_PORT=9090 \
    adaptive-hashmap-cli:local run-csv \
    --csv /data/workloads/w_uniform.csv \
    --metrics-port 9090 \
    --metrics-host 0.0.0.0 \
    --metrics-out-dir /snapshots/metrics
  ```
- Compose stack (metrics + workload runner): `docker compose up --build` (defaults expect `data/workloads/demo.csv`).
- Developer-friendly image: `make docker-build-dev` or `docker build -f docker/Dockerfile.dev` for an editable environment with `.[dev]` installed.
- Need a random high port? Set `--metrics-port auto` (or `ADHASH_METRICS_PORT=auto`) and the CLI will log the bound port after startup.
See `docs/containers/README.md` for environment variables, health checks, and release automation hooks.
The following mini-tour lets you experience every major feature with real commands. Feel free to copy/paste line-by-line.
```bash
# Step 1A: create a config and adjust thresholds quickly
python -m hashmap_cli config-wizard --outfile runs/demo_config.toml

# Step 1B: generate workloads with different access patterns
mkdir -p runs/workloads
python -m hashmap_cli generate-csv --outfile runs/workloads/w_uniform.csv --ops 50000 --read-ratio 0.8 --seed 7
python -m hashmap_cli generate-csv --outfile runs/workloads/w_skew_adv.csv --ops 50000 --read-ratio 0.6 --key-skew 1.2 --adversarial-ratio 0.15

# Step 1C: profile to see which backend is recommended
python -m hashmap_cli profile --csv runs/workloads/w_uniform.csv
python -m hashmap_cli profile --csv runs/workloads/w_skew_adv.csv --then get HOTKEY

# Step 2A: dry-run for validation first
python -m hashmap_cli run-csv --csv runs/workloads/w_uniform.csv --dry-run

# Step 2B: run with metrics streaming + JSON summary + snapshot
mkdir -p runs/metrics_uniform results/json snapshots
python -m hashmap_cli --config runs/demo_config.toml run-csv \
  --csv runs/workloads/w_uniform.csv \
  --metrics-port 9090 \
  --metrics-out-dir runs/metrics_uniform \
  --json-summary-out results/json/uniform_summary.json \
  --snapshot-out snapshots/uniform.pkl.gz --compress

# Step 2C: view metrics in Mission Control (new snapshot inspector + history controls)
python -m hashmap_cli mission-control

# Step 2D: confirm snapshot metadata from the CLI as well
python -m hashmap_cli inspect-snapshot --in snapshots/uniform.pkl.gz --limit 15 --key "'K1'"

# Step 3A: run a skewed workload to force Robin Hood migration + compaction
mkdir -p runs/metrics_skew
python -m hashmap_cli --mode adaptive run-csv \
  --csv runs/workloads/w_skew_adv.csv \
  --json-summary-out results/json/skew_summary.json \
  --metrics-out-dir runs/metrics_skew \
  --snapshot-out snapshots/skew.pkl.gz --compress

# Step 3B: inspect guardrail alerts in the TUI (optional)
python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics

# Step 3C: offline compact the Robin Hood snapshot, then verify & repair
python -m hashmap_cli compact-snapshot --in snapshots/skew.pkl.gz --out snapshots/skew_compacted.pkl.gz --compress
python -m hashmap_cli verify-snapshot --in snapshots/skew_compacted.pkl.gz --repair --out snapshots/skew_repaired.pkl.gz --verbose

# Step 4A: create an alternate config (copy + tweak a field)
python -m hashmap_cli config-edit --infile runs/demo_config.toml --outfile runs/demo_config_candidate.toml --apply-preset default

# Step 4B: run paired benchmarks and collect comparison artifacts
python -m hashmap_cli ab-compare --csv runs/workloads/w_uniform.csv \
  --baseline-config runs/demo_config.toml \
  --candidate-config runs/demo_config_candidate.toml \
  --out-dir results/ab/uniform_demo

# Step 4C: surface throughput/latency deltas in the dashboard
python -m hashmap_cli serve --source results/ab/uniform_demo/artifacts/baseline/metrics/metrics.ndjson \
  --compare results/ab/uniform_demo/uniform_demo_baseline_vs_candidate.json

# Step 5A: run predefined suites (Markdown/HTML reports in results/)
python -m adhash.batch --spec docs/examples/batch_baseline.toml

# Step 5B: load the suite in Mission Control (Benchmark tab) to see log streaming and Workload DNA results
python -m hashmap_cli mission-control
```

By the end of this walkthrough you will have exercised config wizards, workload generation, live metrics (Mission Control + TUI), snapshot verification/repair, A/B comparisons, and batch reporting: the same flows covered in the automated audit.
| Category | Commands | Notes |
|---|---|---|
| Core ops | `put`, `get`, `del`, `items` | Work on any backend (`--mode adaptive`, `fast-lookup`, etc.). |
| Workloads | `generate-csv`, `profile`, `run-csv` | `run-csv` supports snapshots, live metrics, JSON summaries, dry-run validation, throttles. |
| Analytics | `workload-dna`, `ab-compare`, `inspect-snapshot` | Workload DNA reports skew/collision risk; `inspect-snapshot` surfaces versioned metadata and key lookups. |
| Config | `config-wizard`, `config-edit` | Schema-driven generator/editor with preset management. |
| Snapshots | `compact-snapshot`, `verify-snapshot` | Offline compaction/repair for Robin Hood maps with checksum verification. |
| Observability | `serve`, `mission-control`, `scripts/launch_tui.py` | Dashboard server, desktop UI, and terminal UI. |
Run `python -m hashmap_cli -h` for the full command list with flags.
- Add `--json` for machine-readable success payloads (`{"ok": true, "command": "run-csv", ...}`).
- Errors surface through standard envelopes (`BadInput`, `Invariant`, `Policy`, `IO`) with stable exit codes `{0, 2, 3, 4, 5}`.
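A sketch of consuming that envelope from a wrapper script. It assumes only what is documented above (the `--json` flag, the `ok`/`command` fields, and the exit-code set) plus the assumption that the payload is printed on stdout:

```python
import json
import subprocess

proc = subprocess.run(
    ["python", "-m", "hashmap_cli", "run-csv",
     "--csv", "data/workloads/demo.csv", "--dry-run", "--json"],
    capture_output=True, text=True,
)
# Exit code 0 signals success; 2/3/4/5 map to the error envelopes above.
payload = json.loads(proc.stdout)
if proc.returncode == 0 and payload.get("ok"):
    print(f"{payload['command']} validated")
else:
    print(f"failed with exit code {proc.returncode}")
```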
- Snapshots use a versioned header + BLAKE2b checksum (`src/adhash/io/snapshot_header.py`). Untrusted payloads are rejected.
- Saved objects include Robin Hood/Chaining/Adaptive maps; `inspect-snapshot` and Mission Control’s inspector expose metadata, filtered previews, and direct key searches.
- Configs are dataclass-backed (`src/adhash/config.py`) with env overrides. `config-edit` and Mission Control’s editor share the same schema and validation logic.
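To illustrate the header-plus-checksum idea (not the project's actual byte layout, which lives in `src/adhash/io/snapshot_header.py`), here is a toy round-trip with a hypothetical magic string:

```python
import hashlib

MAGIC = b"ADHS"  # hypothetical magic bytes; the real format differs

def wrap(payload: bytes, version: int = 1) -> bytes:
    """Prefix the payload with magic, version, and a BLAKE2b digest."""
    digest = hashlib.blake2b(payload, digest_size=16).digest()
    return MAGIC + version.to_bytes(2, "big") + digest + payload

def unwrap(blob: bytes) -> bytes:
    """Reject anything whose header or checksum does not match."""
    magic, version = blob[:4], int.from_bytes(blob[4:6], "big")
    digest, payload = blob[6:22], blob[22:]
    if magic != MAGIC or version != 1:
        raise ValueError("unsupported snapshot header")
    if hashlib.blake2b(payload, digest_size=16).digest() != digest:
        raise ValueError("checksum mismatch: payload rejected")
    return payload

assert unwrap(wrap(b"snapshot-bytes")) == b"snapshot-bytes"
```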
Typical flows:
```bash
python -m hashmap_cli --mode adaptive run-csv --csv data/workloads/w_heavy_adv.csv \
  --snapshot-out snapshots/adaptive.pkl.gz --compress
python -m hashmap_cli inspect-snapshot --in snapshots/adaptive.pkl.gz --key "'K1'" --limit 5
python -m hashmap_cli verify-snapshot --in snapshots/adaptive.pkl.gz --repair \
  --out snapshots/adaptive_repaired.pkl.gz --verbose
```

```bash
python scripts/launch_tui.py --metrics-endpoint http://127.0.0.1:9090/api/metrics
```
- Displays backend status, operations, load-factor trends, guardrail alerts, and latency percentiles directly in the terminal.
- Press `r` to refresh, `q` to quit. Works with the same `/api/metrics` JSON endpoint as the dashboard.
- `--probe-json trace.json` loads a trace on startup; press `p` to reload after re-exporting from the CLI.
- IDE tip: if you launch this from PyCharm/VS Code, make sure the run configuration includes the `--metrics-endpoint` argument; otherwise the script will print the usage banner and exit.
```bash
python -m adhash.batch --spec docs/examples/batch_baseline.toml
```
- Executes multi-run suites (profilers, `run-csv` jobs) and emits Markdown/HTML reports under `results/`.
- Mission Control’s Benchmark pane wraps the runner with a GUI for discovery, config, and log streaming.

See `docs/batch_runner.md` for spec syntax and report details.
```bash
python -m hashmap_cli serve --port 9090 --source runs/metrics_demo/metrics.ndjson --follow
```
- Serves `/api/metrics`, `/api/metrics/histogram/{latency,probe}`, `/api/metrics/heatmap`, `/api/metrics/history`, and `/api/events` in JSON.
- Optional Prometheus text output at `/metrics` (`docs/prometheus_grafana.md` has scrape configs, dashboards, alert examples).
- NDJSON artifacts (`--metrics-out-dir`, `--metrics-max-ticks`) retain historical ticks for replay, export, and offline analysis.
- Helper scripts:
  - `python scripts/query_metric_endpoint.py http://127.0.0.1:9090/api/metrics [dotted.jq.path]` – curl-style JSON fetcher (always pass the URL).
  - `python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson` – schema validator (requires the NDJSON path).

  Configure run configurations in your IDE with these arguments; running the scripts “naked” will trigger the usage error banner.
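Because the artifact is plain NDJSON (one `metrics.v1` tick per line), ad-hoc analysis needs nothing beyond the standard library. A minimal sketch that assumes only the one-JSON-object-per-line framing:

```python
import json

# Each line of the NDJSON artifact is a complete JSON tick; field names
# come from docs/metrics_schema.md and are not assumed here.
with open("runs/metrics_demo/metrics.ndjson") as fh:
    ticks = [json.loads(line) for line in fh if line.strip()]

print(f"{len(ticks)} ticks captured")
```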
Set `ADHASH_TOKEN` to require `Authorization: Bearer …`. The browser dashboard accepts `?token=` for bootstrapping, and both Mission Control and the TUI automatically include the header.
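Scripting against a token-protected endpoint only requires attaching the header yourself. A sketch using the standard library, assuming the `serve` examples above are running on port 9090:

```python
import json
import os
import urllib.request

req = urllib.request.Request("http://127.0.0.1:9090/api/metrics")
token = os.environ.get("ADHASH_TOKEN")
if token:
    # Same header Mission Control and the TUI send automatically.
    req.add_header("Authorization", f"Bearer {token}")
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```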
- See `docs/ops/runbook.md` for deployment, smoke-test, and release procedures (Mission Control, TUI, and CLI).
- `make smoke` remains the fastest end-to-end validation; run it after config changes and before tagging a release.
- Snapshots: versioned header + checksum, restricted unpickler. Treat third-party files as untrusted; the inspector surfaces checksum mismatches.
- Tokens: metrics dashboard enforces bearer tokens when `ADHASH_TOKEN` is set. No built-in TLS; front with your own reverse proxy for remote exposure.
- Guardrails: load-factor / probe / tombstone thresholds trigger alerts in logs, JSON, dashboards, TUI, and Mission Control banners.
Run locally before every push/release:
```bash
make lint   # ruff
make type   # mypy
make test   # pytest (90 passed / 7 skipped as of Oct 10 2025)

# Docker smoke (pending manual run):
# docker build -t adaptive-hashmap-cli:local -f docker/Dockerfile .
# docker compose -f docker/docker-compose.yml up --build
# Start Docker Desktop first (or another daemon) before running the commands above.
```
Additional smoke:
- `make smoke` – generates a 2k-op workload and validates metrics output.
- `python scripts/validate_metrics_ndjson.py runs/metrics_demo/metrics.ndjson` – asserts schema compliance (`metrics.v1`).
- Capture release notes with Towncrier: add a fragment under `newsfragments/` (e.g., `feature/1234.add-probe-panel.rst`) and run `make release` before tagging to update `docs/CHANGELOG.md`.
Comprehensive command transcripts live in `audits/audit.md` and `reports/`. `reports/command_run_log.tsv` captures every automated/manual invocation with timestamps and status codes.
```text
├── README.md # this guide (kept in sync with audits)
├── pyproject.toml # project metadata, dependencies, console scripts
├── Makefile # lint/type/test/build shortcuts
├── LICENSE # Apache 2.0 license text
├── NOTICE # third-party attributions
├── mypy.ini # static type checker configuration
├── config/ # generated configs captured in docs and audits
├── data/ # sample workloads and config fixtures
├── docker/ # container definitions, compose file, entrypoint
├── docs/ # documentation, guides, release notes
│ ├── containers/README.md # container and deployment reference
│ ├── examples/ # batch runner specs used across walkthroughs
│ ├── ops/runbook.md # operations playbook for releases
│ └── upgrade.md # roadmap and phased milestones
├── audits/ # narrative audit logs and external reviews
├── reports/ # command transcripts, HTML/Markdown audit outputs
├── results/ # JSON summaries, A/B comparisons, dashboards
├── runs/ # generated artifacts (metrics, snapshots, configs)
├── snapshots/ # sample serialized maps for demos/tests
├── scripts/ # helper launchers and tooling (Mission Control, TUI, metrics)
├── newsfragments/ # Towncrier release note fragments
├── build/ # local build artifacts (wheel/lib staging)
├── dist/ # distributable archives produced by builds
├── src/ # source packages
│ ├── adhash/ # core package (CLI, data structures, UIs)
│ │ ├── cli/ # CLI command wiring and orchestration
│ │ ├── core/ # hashmap algorithms and utilities
│ │ ├── metrics/ # metrics server and schema helpers
│ │ ├── mission_control/ # PyQt6 desktop application
│ │ ├── tui/ # Textual terminal UI
│ │ ├── workloads/ # workload generation and profiling tools
│ │ └── hashmap_cli.py # console entry point for `hashmap-cli`
│ └── hashmap_cli/__init__.py # namespace marker used by console scripts
└── tests/ # pytest suite (CLI, GUI, metrics, snapshots)
```
- `audits/audit.md` – authoritative verification log (12 sections + demo), refreshed Oct 2025.
- `docs/config.md` – configuration schema and overrides.
- `docs/metrics_schema.md` – JSON shapes for ticks, histograms, heatmaps, events.
- `docs/prometheus_grafana.md` – integration guide for metrics exporters.
- `docs/batch_runner.md` – benchmark specs & output format.
- `docs/workload_schema.md` – CSV column definitions and validator behaviour.
- `docs/control_surface.md` – REST/Python control surface design and rollout checklist.
- `docs/analysis/probe_visualizer.md` – detailed probe tracing guide.
- Keep lint/type/test spotless; add new tests alongside features (UI widgets have Qt smoke tests under `tests/`).
- Update `audits/audit.md` and `reports/command_run_log.tsv` when adding major commands or artifacts.
- Phase 3 work (deployment & integration) is tracked in `docs/upgrade.md`: Docker packaging, release automation, Helm/Compose templates, etc.
Questions or patches? Open an issue or PR; include the commands/tests you ran and highlight any schema or snapshot changes.
Copyright © 2025 Justin Guida. Licensed under the Apache License, Version 2.0. See LICENSE for details and NOTICE for attribution.
