Conversation

@josephschorr (Member)

Description

Adds a chunk bytes library to the common datastore code for easily storing and loading byte data that will be chunked across multiple rows in the database, since some databases have soft or hard limits on the size of BLOB fields.

This will be used by singleton schemas to store schemas and, likely eventually, schema metadata.
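
For intuition, here is a minimal self-contained sketch of the chunking idea. The helper below is hypothetical, not the library's actual API (which lives in internal/datastore/common/chunkbytes.go); it shows the standard integer ceiling-division idiom used to count chunks.

package main

import "fmt"

// splitIntoChunks splits data into pieces of at most maxChunkSize bytes.
func splitIntoChunks(data []byte, maxChunkSize int) [][]byte {
	// Integer ceiling division: the number of chunks needed to hold all bytes.
	numChunks := (len(data) + maxChunkSize - 1) / maxChunkSize
	chunks := make([][]byte, 0, numChunks)
	for start := 0; start < len(data); start += maxChunkSize {
		end := start + maxChunkSize
		if end > len(data) {
			end = len(data)
		}
		chunks = append(chunks, data[start:end])
	}
	return chunks
}

func main() {
	for i, chunk := range splitIntoChunks([]byte("1234567890"), 4) {
		fmt.Printf("chunk %d: %s\n", i, chunk)
	}
	// chunk 0: 1234
	// chunk 1: 5678
	// chunk 2: 90
}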

Testing

Unit tests. This is very early and not used yet.

@josephschorr requested a review from a team as a code owner on October 20, 2025 at 22:17
@github-actions bot added the area/datastore (Affects the storage system) and area/tooling (Affects the dev or user toolchain, e.g. tests, ci, build tools) labels on Oct 20, 2025

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 85.16484% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.42%. Comparing base (01d72a1) to head (a94abd9).

Files with missing lines                | Patch % | Lines
internal/datastore/common/chunkbytes.go | 85.17%  | 17 Missing and 10 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2627      +/-   ##
==========================================
+ Coverage   77.39%   77.42%   +0.03%     
==========================================
  Files         449      450       +1     
  Lines       56613    56795     +182     
==========================================
+ Hits        43811    43968     +157     
- Misses      10045    10059      +14     
- Partials     2757     2768      +11     


@tstirrat15 (Contributor) left a comment


See comments; looks good to me.

	}

	numChunks := (len(data) + c.maxChunkSize - 1) / c.maxChunkSize
	chunks := make([][]byte, 0, numChunks)
Contributor

For my own edification: when you allocate a slice of slices, is that essentially a slice of pointers?
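
For context, roughly yes: in Go, each element of a [][]byte is a slice header (a pointer to a backing array plus a length and a capacity), so allocating the outer slice allocates headers, not the underlying byte data. A minimal illustration:

package main

import "fmt"

func main() {
	data := []byte("hello world")
	// Subslicing copies only the header; both chunks share data's backing array.
	chunks := [][]byte{data[:5], data[6:]}
	chunks[0][0] = 'H'
	fmt.Println(string(data)) // "Hello world": the mutation is visible through data
}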

require.Contains(t, updateSQL, "SET deleted_at = ?")
require.Contains(t, updateSQL, "WHERE name = ?")
require.Contains(t, updateSQL, "AND deleted_at = ?")
require.Equal(t, []any{createdAt, "test-key", ^uint64(0)}, txn.capturedArgs[0])
Contributor

Why is this an any? And what is ^uint64(0) supposed to represent? Is that a clever way of writing max int?
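
For reference: unary ^ in Go is bitwise complement (NOT), so ^uint64(0) flips every bit of zero and yields the maximum uint64. A quick self-contained check:

package main

import (
	"fmt"
	"math"
)

func main() {
	fmt.Println(^uint64(0) == math.MaxUint64) // true
	fmt.Println(^uint64(0))                   // 18446744073709551615
}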

		},
		{
			name: "multiple exact chunks",
			data: []byte("1234567890"),
Contributor

Do these happen to line up because UTF-8 is one byte per char for ASCII characters?
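
For reference, yes: UTF-8 encodes each ASCII character in exactly one byte, so the ten-character string above is exactly ten bytes; non-ASCII runes would not line up this way. A quick self-contained check:

package main

import "fmt"

func main() {
	fmt.Println(len([]byte("1234567890"))) // 10: one byte per ASCII character
	fmt.Println(len([]byte("é")))          // 2: non-ASCII runes take multiple bytes
}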

@miparnisari changed the title from "feat: add chunk bytes library in prep for singleton schema work" to "chore: add chunk bytes library in prep for singleton schema work" on Oct 23, 2025
@miparnisari (Contributor) left a comment

I've amended your PR title from feat: to chore: because, for customers, this isn't a new feature yet.

Comment on lines +87 to +97
	tableName         string
	nameColumn        string
	chunkIndexColumn  string
	chunkDataColumn   string
	maxChunkSize      int
	placeholderFormat sq.PlaceholderFormat
	executor          ChunkedBytesExecutor
	writeMode         WriteMode
	createdAtColumn   string
	deletedAtColumn   string
	aliveValue        T
@miparnisari (Contributor) Oct 23, 2025

All these fields are already defined in SQLByteChunkerConfig. Why not just hold a pointer to it? This would reduce duplicate code and prevent you from having to write

return &SQLByteChunker[T]{
		tableName:         config.TableName,
		nameColumn:        config.NameColumn,
		chunkIndexColumn:  config.ChunkIndexColumn,
		chunkDataColumn:   config.ChunkDataColumn,
		maxChunkSize:      config.MaxChunkSize,
		placeholderFormat: config.PlaceholderFormat,
		executor:          config.Executor,
		writeMode:         config.WriteMode,
		createdAtColumn:   config.CreatedAtColumn,
		deletedAtColumn:   config.DeletedAtColumn,
		aliveValue:        config.AliveValue,
	}

which is dangerous, because if you add a new configuration field you have to remember to update that code too.
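
A minimal sketch of the suggested refactor (the SQLByteChunker and SQLByteChunkerConfig names come from the review; the config stub below elides the remaining fields):

type SQLByteChunkerConfig[T any] struct {
	TableName    string
	MaxChunkSize int
	// ... remaining fields elided
}

// SQLByteChunker holds a pointer to its config rather than copying every
// field, so adding a config field never requires touching the constructor.
type SQLByteChunker[T any] struct {
	config *SQLByteChunkerConfig[T]
}

func NewSQLByteChunker[T any](config *SQLByteChunkerConfig[T]) *SQLByteChunker[T] {
	return &SQLByteChunker[T]{config: config}
}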

Comment on lines +84 to +86
// SQLByteChunker provides methods for reading and writing byte data
// that is chunked across multiple rows in a SQL table.
type SQLByteChunker[T any] struct {
@miparnisari (Contributor) Oct 23, 2025

Suggested change
-// SQLByteChunker provides methods for reading and writing byte data
-// that is chunked across multiple rows in a SQL table.
-type SQLByteChunker[T any] struct {
+type IntOrString interface {
+	~uint64 | ~string
+}
+
+// SQLByteChunker provides methods for reading and writing byte data
+// that is chunked across multiple rows in a SQL table.
+type SQLByteChunker[T IntOrString] struct {

Comment on lines +331 to +356
// Validate that we have all chunks from 0 to N-1
maxIndex := -1
for index := range chunks {
	if index > maxIndex {
		maxIndex = index
	}
}

// Check for missing chunks
for i := 0; i <= maxIndex; i++ {
	if _, exists := chunks[i]; !exists {
		return nil, fmt.Errorf("missing chunk at index %d", i)
	}
}

// Calculate total size
totalSize := 0
for _, chunk := range chunks {
	totalSize += len(chunk)
}

// Reassemble
result := make([]byte, 0, totalSize)
for i := 0; i <= maxIndex; i++ {
	result = append(result, chunks[i]...)
}
Contributor

Suggested change
-// Validate that we have all chunks from 0 to N-1
-maxIndex := -1
-for index := range chunks {
-	if index > maxIndex {
-		maxIndex = index
-	}
-}
-// Check for missing chunks
-for i := 0; i <= maxIndex; i++ {
-	if _, exists := chunks[i]; !exists {
-		return nil, fmt.Errorf("missing chunk at index %d", i)
-	}
-}
-// Calculate total size
-totalSize := 0
-for _, chunk := range chunks {
-	totalSize += len(chunk)
-}
-// Reassemble
-result := make([]byte, 0, totalSize)
-for i := 0; i <= maxIndex; i++ {
-	result = append(result, chunks[i]...)
-}
+// Validate that we have all chunks from 0 to N-1 and calculate total size
+maxIndex := -1
+totalSize := 0
+for index := range chunks {
+	maxIndex = max(maxIndex, index)
+	totalSize += len(chunks[index])
+}
+// Reassemble, while checking for missing chunks
+result := make([]byte, 0, totalSize)
+for i := 0; i <= maxIndex; i++ {
+	if _, exists := chunks[i]; !exists {
+		return nil, fmt.Errorf("missing chunk at index %d", i)
+	}
+	result = append(result, chunks[i]...)
+}
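
One note on the single-pass version: the built-in max it relies on was added in Go 1.21; older toolchains would need a small local helper instead.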
