

@satp42 satp42 commented Oct 24, 2025

Added new settings to control embedding performance in packages/types/src/config.types.ts. Specifically:

  1. embedding_batch_size (number, default: 64)
  2. embedding_max_threads (number, default: 4)
  3. embedding_max_connections (number, default: 8)
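The actual definitions are TypeScript interfaces in config.types.ts; a minimal Rust-side mirror of the three settings and their defaults might look like this (the struct name is hypothetical):

```rust
// Hypothetical Rust-side mirror of the new embedding settings; the real
// definitions are TypeScript types in packages/types/src/config.types.ts.
#[derive(Debug, Clone, Copy, PartialEq)]
struct EmbeddingSettings {
    batch_size: usize,      // embedding_batch_size, default 64
    max_threads: usize,     // embedding_max_threads, default 4
    max_connections: usize, // embedding_max_connections, default 8
}

impl Default for EmbeddingSettings {
    fn default() -> Self {
        Self { batch_size: 64, max_threads: 4, max_connections: 8 }
    }
}

fn main() {
    let s = EmbeddingSettings::default();
    println!("{} {} {}", s.batch_size, s.max_threads, s.max_connections);
}
```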

Modified packages/backend-server/src/main.rs to accept additional command-line arguments for batch size, max threads, and max connections.
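The PR doesn't show the exact flag names or parser used, so the following std-only sketch assumes `--embedding-batch-size`-style flags; the real main.rs may use a dedicated argument parser:

```rust
use std::env;

// Parse an optional `--flag value` pair, falling back to a default.
// Flag names here are hypothetical; the PR only states that batch size,
// max threads, and max connections are accepted on the command line.
fn arg_or_default(args: &[String], flag: &str, default: usize) -> usize {
    args.iter()
        .position(|a| a == flag)
        .and_then(|i| args.get(i + 1))
        .and_then(|v| v.parse().ok())
        .unwrap_or(default)
}

fn main() {
    let args: Vec<String> = env::args().collect();
    let batch_size = arg_or_default(&args, "--embedding-batch-size", 64);
    let max_threads = arg_or_default(&args, "--embedding-max-threads", 4);
    let max_connections = arg_or_default(&args, "--embedding-max-connections", 8);
    println!("batch={batch_size} threads={max_threads} conns={max_connections}");
}
```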

Updated packages/backend-server/src/server/mod.rs:

  • Added max_connections field to LocalAIServer
  • Implemented a semaphore-style connection counter to cap concurrent client connections (previously a thread was spawned per client with no limit)
  • Configured rayon global thread pool using rayon::ThreadPoolBuilder before starting the server
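For the thread pool, rayon's real API for this is `rayon::ThreadPoolBuilder::new().num_threads(n).build_global()`, which must run before any rayon work starts. The connection cap below is a std-only sketch of the counter approach (field and method names are guesses, not the PR's actual code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Std-only sketch of a connection cap: reserve a slot before spawning a
// handler thread, release it when the handler finishes. The real server
// also configures the rayon global pool up front, e.g.
//   rayon::ThreadPoolBuilder::new().num_threads(max_threads).build_global()?;
struct ConnectionLimiter {
    active: AtomicUsize,
    max: usize,
}

impl ConnectionLimiter {
    fn new(max: usize) -> Arc<Self> {
        Arc::new(Self { active: AtomicUsize::new(0), max })
    }

    // Returns true if a slot was acquired; callers must later call `release`.
    fn try_acquire(&self) -> bool {
        let mut cur = self.active.load(Ordering::Relaxed);
        loop {
            if cur >= self.max {
                return false; // at capacity: reject instead of spawning
            }
            match self.active.compare_exchange(
                cur, cur + 1, Ordering::AcqRel, Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                Err(actual) => cur = actual, // raced with another thread; retry
            }
        }
    }

    fn release(&self) {
        self.active.fetch_sub(1, Ordering::AcqRel);
    }
}

fn main() {
    let limiter = ConnectionLimiter::new(8);
    assert!(limiter.try_acquire());
    limiter.release();
}
```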

Modified packages/backend-server/src/embeddings/model.rs:

  • Added batch_size field to EmbeddingModel struct
  • Replaced hardcoded batch size Some(1) at line 71 with configurable self.batch_size
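A sketch of what the batching change buys (only `EmbeddingModel` and `batch_size` come from the PR; the helper method is hypothetical):

```rust
// Sketch of the batching change: the model now carries a configurable
// batch_size instead of the previous hardcoded Some(1).
struct EmbeddingModel {
    batch_size: usize, // new field added by the PR
}

impl EmbeddingModel {
    // Hypothetical helper: split inputs into model-sized batches so each
    // forward pass embeds up to `batch_size` texts at once.
    fn batches<'a>(&self, texts: &'a [String]) -> Vec<&'a [String]> {
        texts.chunks(self.batch_size).collect()
    }
}

fn main() {
    let model = EmbeddingModel { batch_size: 64 };
    let texts: Vec<String> = (0..130).map(|i| format!("chunk {i}")).collect();
    // 130 chunks -> 3 forward passes (64 + 64 + 2) instead of 130.
    println!("{}", model.batches(&texts).len());
}
```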

Passed Configuration from Electron Main Process

Implemented Lazy Embeddings for Large Document Types

  • Extended lazy embeddings logic to include ResourceTextContentType::PDF, ResourceTextContentType::Document, and ResourceTextContentType::Article
  • These document types will get a generateLazyEmbeddings tag instead of immediate embedding generation
  • Embeddings will then be generated on-demand when documents are accessed in chat/search
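The decision logic above can be sketched as a simple match over content types. The three lazy variants are from the PR; the eager variant and function name below are hypothetical, and the real tagging likely happens in the TypeScript layer:

```rust
// Sketch of the lazy-embedding decision. Only the three lazy variants
// (PDF, Document, Article) come from the PR; `Note` is a hypothetical
// example of a type that still embeds eagerly at upload time.
#[derive(Debug, PartialEq)]
enum ResourceTextContentType {
    PDF,
    Document,
    Article,
    Note,
}

// Large document types get tagged `generateLazyEmbeddings` and are embedded
// on demand when first accessed in chat/search, instead of at upload time.
fn embeds_lazily(t: &ResourceTextContentType) -> bool {
    matches!(
        t,
        ResourceTextContentType::PDF
            | ResourceTextContentType::Document
            | ResourceTextContentType::Article
    )
}

fn main() {
    println!("{}", embeds_lazily(&ResourceTextContentType::PDF));
}
```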

Optimized Chunking Strategy

  • Increased max_chunk_size from 2000 to 2500 characters (reduces total chunks by ~20% while maintaining quality)
  • Kept overlap_sentences at 1 for continuity
  • This change reduced the number of embeddings needed per document
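A back-of-envelope check of the ~20% figure: with the overlap held fixed, chunk count scales roughly inversely with chunk size, so 2000 → 2500 gives about 1 − 2000/2500 = 20% fewer chunks. Sketch (the 100k-character document length is a made-up example):

```rust
// Back-of-envelope check of the ~20% chunk reduction: ignoring the
// one-sentence overlap, chunk count ~= ceil(doc_len / max_chunk_size).
fn chunk_count(doc_len: usize, max_chunk_size: usize) -> usize {
    doc_len.div_ceil(max_chunk_size)
}

fn main() {
    let doc_len = 100_000; // hypothetical 100k-character document
    let before = chunk_count(doc_len, 2000); // 50 chunks
    let after = chunk_count(doc_len, 2500); // 40 chunks
    let reduction = 100 * (before - after) / before;
    println!("{before} -> {after} ({reduction}% fewer)");
}
```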

The expected impact of this PR:

  • Batch size increase (1 → 64): reduces CPU overhead through better model utilization
  • Thread pool limits: prevent CPU saturation and keep usage under control
  • Connection limits: prevent thread explosion during bulk uploads
  • Lazy embeddings for large docs: defer expensive embedding work until documents are actually accessed
  • Larger chunks (2000 → 2500): fewer embeddings to generate and store

Related to #28

@satp42 satp42 marked this pull request as draft October 24, 2025 22:10
@satp42 satp42 marked this pull request as ready for review October 24, 2025 22:11
@satp42 satp42 marked this pull request as draft October 24, 2025 22:12
@satp42 satp42 marked this pull request as ready for review October 24, 2025 22:13
@aavshr aavshr self-requested a review October 31, 2025 12:33
