[FR] Bulk download between recent checkpoints? #2098

@woodruffw

I had this idea while playing around with my own monitoring tool, curious to hear what the Rekor folks think 🙂 -- if you think it's too complicated or otherwise not worth the effort please close!

Description

Right now, a real-time log monitor might have an event loop like this:

  1. Persist the last observed checkpoint
  2. Wait until a new checkpoint appears
  3. Audit all entries in the range [old, new)
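A rough sketch of one iteration of that loop, for illustration only. It assumes a checkpoint can be reduced to the tree size it commits to, so that [old, new) is a range of log indices. `monitor_step`, `fetch_tree_size` and `audit_range` are hypothetical names, not part of Rekor or any existing monitor.

```python
from typing import Callable, Optional


def monitor_step(
    last_size: Optional[int],
    fetch_tree_size: Callable[[], int],
    audit_range: Callable[[int, int], None],
) -> int:
    """Run one loop iteration and return the tree size to persist.

    last_size:       tree size of the last observed checkpoint (None on first run)
    fetch_tree_size: hypothetical callable returning the current checkpoint's tree size
    audit_range:     hypothetical callable auditing entries with indices in [old, new)
    """
    current = fetch_tree_size()
    if last_size is None:
        # First run: nothing to compare against, just record the checkpoint.
        return current
    if current < last_size:
        # An append-only log should never shrink; treat this as an error.
        raise ValueError(f"tree size went backwards: {last_size} -> {current}")
    if current > last_size:
        # Step 3: audit everything added since the persisted checkpoint.
        audit_range(last_size, current)
    # Only reached if the audit did not raise, so it is safe to persist.
    return current
```

The caller would persist the returned value (step 1) and sleep or poll before calling again (step 2).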

To do (3), the monitor calls /api/v1/log/entries/retrieve repeatedly for ranges of indices in [old, new), with each call handling a maximum of 10 indices. Typical checkpoint ranges currently include a few hundred entries, so the retrieval loop takes a decent amount of time (and monitoring requires more fallible network round-trips than strictly necessary).
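To make the cost concrete, here is a minimal sketch of how a monitor might split [old, new) into request bodies of at most 10 indices each. The `logIndexes` field name is my reading of the Rekor v1 API and should be checked against the OpenAPI spec; the helper names are hypothetical, and no network calls are made.

```python
from typing import Dict, Iterator, List

MAX_INDICES_PER_CALL = 10  # per-request limit described above


def index_batches(
    old_size: int, new_size: int, batch: int = MAX_INDICES_PER_CALL
) -> Iterator[List[int]]:
    """Yield lists of log indices covering [old_size, new_size), at most `batch` each."""
    for start in range(old_size, new_size, batch):
        yield list(range(start, min(start + batch, new_size)))


def retrieve_request_bodies(old_size: int, new_size: int) -> List[Dict[str, List[int]]]:
    """Build one JSON-style body per call to /api/v1/log/entries/retrieve.

    Assumption: the endpoint accepts a "logIndexes" array in the POST body.
    """
    return [{"logIndexes": indices} for indices in index_batches(old_size, new_size)]


if __name__ == "__main__":
    bodies = retrieve_request_bodies(100, 425)
    print(len(bodies), "requests for", 425 - 100, "entries")
```

For a span of 325 entries this produces 33 separate requests, each of which can fail independently.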

My proposal: For the last N checkpoints (pick N to balance size tradeoffs), Rekor could bundle the entries between each pair of adjacent checkpoints into a single payload. These payloads could then be made available via an endpoint like /api/v1/log/entries/retrieve/by-checkpoints (or similar), where the request to that endpoint specifies the checkpoint span.
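One way the server side could look, as a purely hypothetical sketch: an in-memory cache that keeps the bundles for the last N checkpoint spans and evicts the oldest. Nothing here reflects Rekor's actual storage or code; `CheckpointBundleCache` and its methods are invented for illustration, and spans are keyed by (old tree size, new tree size) as an assumption.

```python
from collections import OrderedDict
from typing import List, Optional, Tuple


class CheckpointBundleCache:
    """Hypothetical cache of entry bundles for the most recent checkpoint spans."""

    def __init__(self, max_bundles: int) -> None:
        if max_bundles < 1:
            raise ValueError("max_bundles must be at least 1")
        self.max_bundles = max_bundles
        # Maps (old_size, new_size) -> entries in that span, oldest span first.
        self._bundles = OrderedDict()

    def on_checkpoint(self, old_size: int, new_size: int, entries: List[bytes]) -> None:
        """Record the entries added between two adjacent checkpoints."""
        if len(entries) != new_size - old_size:
            raise ValueError("entry count does not match the checkpoint span")
        self._bundles[(old_size, new_size)] = list(entries)
        # Keep only the last N spans to bound storage.
        while len(self._bundles) > self.max_bundles:
            self._bundles.popitem(last=False)

    def get(self, span: Tuple[int, int]) -> Optional[List[bytes]]:
        """Return the bundle for a span, or None if it was never stored or was evicted."""
        return self._bundles.get(span)
```

A request handler for the proposed endpoint would then amount to a `get` on the requested span, returning a "not found" response when the span has aged out.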

Pros:

  • In the "happy" case, this would reduce the number of monitor network requests to Rekor per checkpoint span from O(N) in the number of entries to O(1), making the monitor faster and reducing pressure on Rekor (this may not be significant anyways)

Cons:

  • Additional storage requirements on Rekor's side, along with a small amount of server complexity
  • In the "sad" case (where a monitor is catching up or missed a checkpoint for whatever reason), the number of network requests degrades back to O(N) in the number of entries. This could be addressed through an even more clever "windowing" approach (where Rekor bundles all entries from the last N checkpoints into one giant payload and offers ranges over it), but this is even more complicated.
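The happy and sad cases can be compared with a small request-count calculation. This is a sketch under the assumptions above (10 indices per retrieve call, one request per bundled span); `requests_needed` is a hypothetical helper, not an existing API.

```python
import math
from typing import Set, Tuple


def requests_needed(
    old_size: int,
    new_size: int,
    bundled_spans: Set[Tuple[int, int]],
    batch: int = 10,
) -> int:
    """Count the requests a monitor needs to fetch entries in [old_size, new_size).

    bundled_spans: spans (old, new) for which the server still holds a bundle.
    """
    if new_size <= old_size:
        return 0  # nothing new to fetch
    if (old_size, new_size) in bundled_spans:
        return 1  # happy case: a single bundle download
    # Sad case: fall back to batched retrieval by index.
    return math.ceil((new_size - old_size) / batch)
```

With a 325-entry span, the bundled path costs 1 request and the fallback costs 33.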

TL;DR: Rekor could bundle ranges between pairs of recent checkpoints to accelerate a common monitor retrieval pattern. This would reduce network traffic and improve monitor performance, at the cost of some additional storage and server complexity.

Labels: enhancement
