A data ingestion, retrieval, and analysis backend for microcontroller sensor streams (ESP32) with initial support for RAG-style natural language querying.
This repository provides:
- a FastAPI server for ingesting and querying sensor data
- tools for semantic indexing and retrieval via LLM agent
- a planned UI component (Next.js frontend integration) for data visualization and interaction
This project is evolving toward a production-ready sensor data platform with LLM chat interaction and analytics.
The esp32_api project is designed to collect, store, and query time-series sensor data from ESP32 microcontrollers. It also integrates an agentic AI framework that allows users to ask open-ended questions about trends and anomalies in the data, experimenting with both Retrieval Augmented Generation (RAG) and natural language-to-SQL (NLSQL) strategies.
Key aspects of the stack include:
- FastAPI backend with REST endpoints (/ingest, /timeseries)
- PostgreSQL database for sensor storage and snapshots
- Vector embedding and semantic retrieval endpoints (/rag/query, /rag/index, /rag/ingest_docs)
- Planned integration with a React/Next.js frontend for visualization and agent interaction
esp32_api/
├── server/ # Backend service (FastAPI)
│ ├── app/ # Core application modules
│ ├── rag/ # RAG & LLM logic
│ └── main.py # Data ingestion & status
├── device/ # MicroPython scripts for ESP32
├── ui/ # Placeholder for future frontend
├── docs/ # Project documentation
├── .env.example # Example environment vars
└── README.md # This file
POST /ingest
Ingests sensor payloads from ESP32 devices.
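As a rough illustration of what a client call to this endpoint might look like, the sketch below builds (but does not send) a JSON POST request using only the Python standard library. The payload field names (`device_id`, `timestamp`, `readings`), the helper `build_ingest_request`, and the base URL are assumptions made for the example; the real request schema is not documented yet.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_ingest_request(base_url: str, device_id: str, readings: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /ingest request with a JSON body."""
    # All three payload keys are hypothetical; adjust them to the actual schema.
    payload = {
        "device_id": device_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "readings": readings,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local server address, matching the uvicorn port used in this README.
req = build_ingest_request("http://localhost:8000", "esp32-01", {"temperature_c": 21.5})
```

Passing `req` to `urllib.request.urlopen` would send it to a running server.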
GET /timeseries
Returns filtered time-series data based on query parameters (sensor, from, to, avg, etc.).
- POST /rag/query: Chatbot that answers questions by generating SQL queries for the data and searching contextual documents.
- POST /rag/index: Batch-embeds time-series snapshots into a vector store so the LLM can answer data questions without SQL.
- POST /rag/ingest_docs: Splits PDFs and web pages into small, overlapping text chunks and embeds them in a vector DB.
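The overlapping chunking step used for document ingestion can be sketched as a fixed-size sliding window over characters. This is a minimal illustration, not the repository's implementation: the function name `chunk_text` and the default sizes are assumptions, and the actual endpoint may split on tokens, sentences, or another boundary.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters, so a sentence cut at a
    chunk boundary still appears whole in at least one neighbouring chunk
    (provided it is shorter than the overlap).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


example = chunk_text("abcdefghij", chunk_size=4, overlap=2)
# example == ["abcd", "cdef", "efgh", "ghij"]
```

Each chunk would then be passed to an embedding model and stored alongside its vector; a larger overlap trades extra storage for better recall on passages that straddle a boundary.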
(Detailed request/response schemas to be documented in /docs/endpoints/*.md.)
The system is composed of the following high-level components:
ESP32 Microcontroller
↓ (posts data live via HTTP)
FastAPI Backend ── PostgreSQL ── Snapshots & Raw Data
├─ /ingest, /timeseries
└─ /rag/index, /rag/query, /rag/ingest_docs
├─ SQL + Time-Series Logic
└─ RAG / Embeddings
Front-end (Next.js / React / TypeScript) — visualization & agent interaction
The planned frontend may integrate the Vercel/Next AI SDK, in which case the /rag endpoints would move to the frontend.
- Python 3.11+
- PostgreSQL instance with credentials available (e.g. Supabase)
- pgvector extension installed in PostgreSQL for vector embeddings (e.g. in Supabase, install vector under Database > Extensions)
Clone the repository and install dependencies:
git clone https://github.com/postoccupancy/esp32_api.git
cd esp32_api
pip install -r server/requirements.txt
Set up environment variables (see Configuration below), then start the API:
uvicorn server.main:app --host 0.0.0.0 --port 8000 --reload
Environment variables are used to control database connections. Copy .env.example to .env and fill in the required values.
This repository follows a structured documentation layout inspired by best practices. The primary documentation is housed under the docs/ folder. Key sections include:
Notes & Tutorials
- docs/2025-12-17-open-source-agent-stack.md: Mapping out the open source agent development toolkit.
- docs/2026-01-27-rag-setup-and-next-steps.md: Discussion of how the RAG endpoints work, and how I plan to use them in an AI-enhanced data visualization interface.
- (Future) additional notes covering architectural decisions and integrations.
Contributions are welcome! Structured guidelines will live in CONTRIBUTING.md once it is created. For now:
- Fork the repo
- Create a descriptive branch
- Open a pull request with context and tests (when available)
This project is open source and released under the BSD 3-Clause License.
More documentation planned...