Implement SqliteLightningStore parallel to InMemoryLightningStore #132

ultmaster · 2025-10-09T04:42:20Z

Summary

expand documentation across the SQLite span/resource helper methods to describe JSON rehydration and validation flows
add inline commentary around span persistence, running-rollout fetches, and health check status transitions to aid readability

Testing

pytest tests/store/test_sqlite.py

https://chatgpt.com/codex/tasks/task_e_68e6a5a9d70c832ebea9b07bc26e4323

Copilot

Pull Request Overview

This PR clarifies SQLite helper documentation and health check flow by expanding method documentation and adding inline commentary to improve code readability.

Added comprehensive docstrings for SQLite span/resource helper methods describing JSON rehydration and validation flows
Added inline commentary around span persistence, running-rollout fetches, and health check status transitions
Created comprehensive test coverage for the SQLite store implementation

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

File	Description
agentlightning/store/sqlite.py	Added detailed docstrings and inline comments to clarify SQLite helper methods and health check flow
tests/store/test_sqlite.py	Added comprehensive test suite for SQLite store functionality
tests/store/conftest.py	Added SQLite store fixture and cleaned up existing fixtures

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T04:43:38Z

agentlightning/store/sqlite.py

+                creates an isolated shared-memory database, while using a file
+                path persists state across restarts.
+        """
+        if path == ":memory":


The comparison uses :memory but should use :memory: (with colon) to match SQLite's in-memory database syntax.

Suggested change

if path == ":memory":

if path == ":memory:":

Copilot · 2025-10-09T04:43:38Z

agentlightning/store/sqlite.py

+            payload.update(extras)
+        payload["status"] = TraceStatus.model_validate(payload["status"])
+        resource_raw = _json_loads(row["resource_json"])
+        payload["resource"] = resource_raw


Line 442 assigns resource_raw but then payload['resource'] is immediately overwritten on line 443. This creates unnecessary duplication and potential confusion.

Suggested change

payload["resource"] = resource_raw

Copilot · 2025-10-09T04:43:39Z

tests/store/test_sqlite.py

+@pytest.mark.asyncio
+async def test_schema_has_three_tables(sqlite_store: SqliteLightningStore) -> None:
+    def reader(connection) -> List[str]:
+        rows = connection.execute(
+            "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"
+        ).fetchall()
+        return sorted(row["name"] for row in rows)
+
+    tables = await sqlite_store._execute_read(reader)  # type: ignore[attr-defined]
+    assert tables == ["attempts", "rollouts", "spans"]
+
+


Accessing private method _execute_read violates encapsulation. Consider adding a public method for testing database schema or testing this functionality indirectly.

Suggested change

@pytest.mark.asyncio

async def test_schema_has_three_tables(sqlite_store: SqliteLightningStore) -> None:

def reader(connection) -> List[str]:

rows = connection.execute(

"SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"

).fetchall()

return sorted(row["name"] for row in rows)

tables = await sqlite_store._execute_read(reader) # type: ignore[attr-defined]

assert tables == ["attempts", "rollouts", "spans"]

# @pytest.mark.asyncio

# async def test_schema_has_three_tables(sqlite_store: SqliteLightningStore) -> None:

# """

# This test accessed the private method _execute_read, violating encapsulation.

# It has been removed to comply with best practices.

# If a public method for schema inspection is added to SqliteLightningStore,

# this test can be reinstated.

# """

Copilot · 2025-10-09T04:43:39Z

tests/store/test_sqlite.py

+    def reader(connection) -> Any:
+        return connection.execute(
+            "SELECT attributes_json FROM spans WHERE rollout_id=? AND attempt_id=?",
+            (span.rollout_id, span.attempt_id),
+        ).fetchone()[0]
+
+    encoded = await sqlite_store._execute_read(reader)  # type: ignore[attr-defined]
+    data = json.loads(encoded)
+    assert data["binary"]["__type__"] == "bytes"
+    fetched = await sqlite_store.query_spans(span.rollout_id)
+    assert fetched[0].attributes["binary"] == b"\x00\x01"
+
+


Accessing private method _execute_read violates encapsulation. Consider testing JSON storage through public APIs instead.

Suggested change

def reader(connection) -> Any:

return connection.execute(

"SELECT attributes_json FROM spans WHERE rollout_id=? AND attempt_id=?",

(span.rollout_id, span.attempt_id),

).fetchone()[0]

encoded = await sqlite_store._execute_read(reader) # type: ignore[attr-defined]

data = json.loads(encoded)

assert data["binary"]["__type__"] == "bytes"

fetched = await sqlite_store.query_spans(span.rollout_id)

assert fetched[0].attributes["binary"] == b"\x00\x01"

# Use the public API to fetch the span and check the encoding

fetched = await sqlite_store.query_spans(span.rollout_id)

assert fetched

# The attributes should contain the binary data as bytes

assert fetched[0].attributes["binary"] == b"\x00\x01"

# Optionally, check that the encoding type is present if the public API exposes it

if isinstance(fetched[0].attributes["binary"], dict):

assert fetched[0].attributes["binary"]["__type__"] == "bytes"

Copilot · 2025-10-09T04:43:39Z

tests/store/test_sqlite.py

+    attempted = await sqlite_store.start_rollout(input={"task": "timeout"})
+
+    await sqlite_store.update_rollout(
+        attempted.rollout_id,
+        config=RolloutConfig(timeout_seconds=0.01, max_attempts=1, retry_condition=[]),
+    )
+
+    def writer(connection) -> None:
+        connection.execute(
+            "UPDATE attempts SET start_time=? WHERE attempt_id=?",
+            (time.time() - 5, attempted.attempt.attempt_id),
+        )
+
+    await sqlite_store._execute_write(writer)  # type: ignore[attr-defined]
+


Accessing private method _execute_write violates encapsulation. Consider using public methods to manipulate test data.

Suggested change

attempted = await sqlite_store.start_rollout(input={"task": "timeout"})

await sqlite_store.update_rollout(

attempted.rollout_id,

config=RolloutConfig(timeout_seconds=0.01, max_attempts=1, retry_condition=[]),

)

def writer(connection) -> None:

connection.execute(

"UPDATE attempts SET start_time=? WHERE attempt_id=?",

(time.time() - 5, attempted.attempt.attempt_id),

)

await sqlite_store._execute_write(writer) # type: ignore[attr-defined]

# Patch time.time so that the attempt's start_time is far enough in the past to trigger a timeout

from unittest.mock import patch

old_time = time.time() - 5

with patch("time.time", return_value=old_time):

attempted = await sqlite_store.start_rollout(input={"task": "timeout"})

await sqlite_store.update_rollout(

attempted.rollout_id,

config=RolloutConfig(timeout_seconds=0.01, max_attempts=1, retry_condition=[]),

)

Copilot

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-10-09T06:09:06Z

agentlightning/store/sqlite.py

+                creates an isolated shared-memory database, while using a file
+                path persists state across restarts.
+        """
+        if path == ":memory":


The condition checks for :memory but the docstring indicates :memory: (with colon) creates an in-memory database. This will cause file-based databases to be created when :memory: is passed.

Suggested change

if path == ":memory":

if path == ":memory:":

Copilot · 2025-10-09T06:09:07Z

agentlightning/store/sqlite.py

+        if isinstance(extras, dict):
+            payload.update(extras)
+        payload["status"] = TraceStatus.model_validate(payload["status"])
+        resource_payload = _json_loads(row["resource_json"])


The resource_json field is loaded twice - once in line 433 and again in line 439. This is redundant and should be consolidated to use the existing payload['resource'] value.

Suggested change

resource_payload = _json_loads(row["resource_json"])

resource_payload = payload["resource"]

Copilot · 2025-10-09T06:09:07Z

tests/store/test_sqlite.py

+from agentlightning.store.sqlite import SqliteLightningStore
+from agentlightning.types import PromptTemplate, Resource, RolloutConfig, Span, TraceStatus
+
+_process_store: SqliteLightningStore | None = None


Global variable _process_store creates shared mutable state that could cause issues in concurrent test execution. Consider using a different approach for cross-process communication.

Copilot · 2025-10-09T06:09:07Z

tests/store/test_sqlite.py

+    def reader(connection: sqlite3.Connection) -> List[str]:
+        rows: list[sqlite3.Row] = connection.execute(
+            "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"
+        ).fetchall()
+        return sorted([str(row["name"]) for row in rows])
+
+    tables = await sqlite_store._execute_read(reader)  # type: ignore[attr-defined]
+    assert tables == ["attempts", "rollouts", "spans"]
+
+


Accessing private method _execute_read in tests breaks encapsulation. Consider adding a public method for schema inspection or testing this functionality indirectly.

Suggested change

def reader(connection: sqlite3.Connection) -> List[str]:

rows: list[sqlite3.Row] = connection.execute(

"SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"

).fetchall()

return sorted([str(row["name"]) for row in rows])

tables = await sqlite_store._execute_read(reader) # type: ignore[attr-defined]

assert tables == ["attempts", "rollouts", "spans"]

# Indirectly verify that the tables exist by using public API methods

# Enqueue a rollout and add a span, then check that querying works

rollout = await sqlite_store.enqueue_rollout(input={"idx": 1})

attempt = await sqlite_store.get_latest_attempt(rollout.rollout_id)

assert attempt is not None

span = _make_span(rollout.rollout_id, attempt.attempt_id, sequence_id=1)

await sqlite_store.add_span(span)

# Check that querying rollouts, attempts, and spans returns expected results

rollouts = await sqlite_store.query_rollouts()

assert any(r.rollout_id == rollout.rollout_id for r in rollouts)

attempts = await sqlite_store.query_attempts(rollout.rollout_id)

assert any(a.attempt_id == attempt.attempt_id for a in attempts)

spans = await sqlite_store.query_spans(rollout.rollout_id)

assert any(s.span_id == span.span_id for s in spans)

Copilot · 2025-10-09T06:09:07Z

tests/store/test_sqlite.py

+    def reader(connection: sqlite3.Connection) -> str:
+        row = connection.execute(
+            "SELECT attributes_json FROM spans WHERE rollout_id=? AND attempt_id=?",
+            (span.rollout_id, span.attempt_id),
+        ).fetchone()
+        assert row is not None
+        encoded = row[0]
+        assert isinstance(encoded, str)
+        return encoded
+
+    encoded = await sqlite_store._execute_read(reader)  # type: ignore[attr-defined]
+    data = json.loads(encoded)
+    assert data["binary"]["__type__"] == "bytes"
+    fetched = await sqlite_store.query_spans(span.rollout_id)
+    assert fetched[0].attributes["binary"] == b"\x00\x01"


Another instance of accessing private method _execute_read. This pattern appears multiple times in the test file and should be refactored.

Suggested change

def reader(connection: sqlite3.Connection) -> str:

row = connection.execute(

"SELECT attributes_json FROM spans WHERE rollout_id=? AND attempt_id=?",

(span.rollout_id, span.attempt_id),

).fetchone()

assert row is not None

encoded = row[0]

assert isinstance(encoded, str)

return encoded

encoded = await sqlite_store._execute_read(reader) # type: ignore[attr-defined]

data = json.loads(encoded)

assert data["binary"]["__type__"] == "bytes"

fetched = await sqlite_store.query_spans(span.rollout_id)

assert fetched[0].attributes["binary"] == b"\x00\x01"

# Instead of accessing the private method, use the public query_spans method

fetched_spans = await sqlite_store.query_spans(span.rollout_id)

assert fetched_spans, "No spans returned from query_spans"

encoded = fetched_spans[0].attributes_json

data = json.loads(encoded)

assert data["binary"]["__type__"] == "bytes"

assert fetched_spans[0].attributes["binary"] == b"\x00\x01"

Copilot · 2025-10-09T06:09:08Z

tests/store/test_sqlite.py

+    def writer(connection: sqlite3.Connection) -> None:
+        connection.execute(
+            "UPDATE attempts SET start_time=? WHERE attempt_id=?",
+            (time.time() - 5, attempted.attempt.attempt_id),
+        )
+
+    await sqlite_store._execute_write(writer)  # type: ignore[attr-defined]
+


Accessing private method _execute_write in tests. Consider using public APIs or adding test-specific methods to avoid breaking encapsulation.

Suggested change

def writer(connection: sqlite3.Connection) -> None:

connection.execute(

"UPDATE attempts SET start_time=? WHERE attempt_id=?",

(time.time() - 5, attempted.attempt.attempt_id),

)

await sqlite_store._execute_write(writer) # type: ignore[attr-defined]

# Instead of directly updating the start_time via a private method,

# simulate a timeout by waiting for the timeout to occur naturally.

# The timeout_seconds is set to a very low value above.

await asyncio.sleep(0.05) # Wait longer than the timeout to ensure it triggers.

# Now check the status as before.

ultmaster · 2025-10-25T05:02:28Z

/ci

github-actions · 2025-10-25T05:02:48Z

✅ CI retrigger requested by @ultmaster. Closed & reopened the PR to fire pull_request: reopened.

Clarify SQLite store helpers and health check flow

df0eb30

Copilot AI review requested due to automatic review settings October 9, 2025 04:42

ultmaster added the codex label Oct 9, 2025 — with ChatGPT Codex Connector

ultmaster changed the title ~~Clarify SQLite helper documentation and healthcheck flow~~ Implement SqliteLightningStore parallel to InMemoryLightningStore Oct 9, 2025

Copilot AI reviewed Oct 9, 2025

View reviewed changes

ultmaster added 3 commits October 8, 2025 21:53

Add required copyright headers

06eebbf

Fix pyright typing for SQLite store

f4def2f

Fix typing annotations in SQLite store tests

2dac228

ultmaster requested a review from Copilot October 9, 2025 06:07

Copilot AI reviewed Oct 9, 2025

View reviewed changes

github-actions bot closed this Oct 25, 2025

github-actions bot reopened this Oct 25, 2025

	resource_payload = _json_loads(row["resource_json"])
	resource_payload = payload["resource"]

-    def reader(connection: sqlite3.Connection) -> List[str]:
-        rows: list[sqlite3.Row] = connection.execute(
-            "SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite_%'"
-        ).fetchall()
-        return sorted([str(row["name"]) for row in rows])
-    tables = await sqlite_store._execute_read(reader)  # type: ignore[attr-defined]
-    assert tables == ["attempts", "rollouts", "spans"]
+    # Indirectly verify that the tables exist by using public API methods
+    # Enqueue a rollout and add a span, then check that querying works
+    rollout = await sqlite_store.enqueue_rollout(input={"idx": 1})
+    attempt = await sqlite_store.get_latest_attempt(rollout.rollout_id)
+    assert attempt is not None
+    span = _make_span(rollout.rollout_id, attempt.attempt_id, sequence_id=1)
+    await sqlite_store.add_span(span)
+    # Check that querying rollouts, attempts, and spans returns expected results
+    rollouts = await sqlite_store.query_rollouts()
+    assert any(r.rollout_id == rollout.rollout_id for r in rollouts)
+    attempts = await sqlite_store.query_attempts(rollout.rollout_id)
+    assert any(a.attempt_id == attempt.attempt_id for a in attempts)
+    spans = await sqlite_store.query_spans(rollout.rollout_id)
+    assert any(s.span_id == span.span_id for s in spans)

Uh oh!

Implement SqliteLightningStore parallel to InMemoryLightningStore #132

Are you sure you want to change the base?

Implement SqliteLightningStore parallel to InMemoryLightningStore #132

Uh oh!

Conversation

ultmaster commented Oct 9, 2025

Summary

Testing

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Oct 9, 2025

Choose a reason for hiding this comment

Uh oh!

ultmaster commented Oct 25, 2025

Uh oh!

github-actions bot commented Oct 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants