Description
Feature Area
AI Services / embedding
Painpoint
Embedding is very slow for large numbers of documents. Requests can only be sent so quickly, and Ollama is designed to handle them one by one rather than simultaneously.
I have a very large number of documents, and I am converting many PDFs to Markdown so they can be embedded. Embedding takes a very long time, it randomly fails with errors, and sometimes the embeddings have to be rebuilt. I feel like I spend more time waiting for embedding than I ever do using the tools.
Describe your idea
Allow a second or even third Ollama instance to be specified, configured the same way as the first, so that embedding requests can be distributed across multiple instances of Ollama.
This could potentially even be done on a single machine, with Docker for example, but for those with multiple machines on the network it could distribute embedding requests across them. In the best case it would cut the embedding time to a half or a third (or less if more instances were allowed).
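To make the idea concrete, here is a rough sketch of how requests could be spread across several instances. It is not how this project does embedding today, only an illustration. It assumes each instance exposes Ollama's `/api/embed` endpoint on its base URL; the host addresses, the `nomic-embed-text` model name, and the function names `embed_with_ollama` and `embed_across_hosts` are all placeholders chosen for the example. Each instance gets one worker thread that pulls the next pending text from a shared queue, so every instance only ever sees one request at a time and a faster machine naturally takes on more of the work.

```python
import json
import queue
import threading
import urllib.request


def embed_with_ollama(host, text, model="nomic-embed-text"):
    """Send one text to a single Ollama instance and return its vector.

    Assumes the instance serves POST {host}/api/embed; the model name
    is only an example and must already be pulled on that instance.
    """
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{host}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["embeddings"][0]


def embed_across_hosts(texts, hosts, embed_fn=embed_with_ollama):
    """Embed texts using one worker thread per host.

    Returns (results, errors). results[i] is the vector for texts[i],
    or None if that request failed. errors lists (index, host, exception)
    so failed items can be retried instead of rebuilding everything.
    """
    pending = queue.Queue()
    for index, text in enumerate(texts):
        pending.put((index, text))

    results = [None] * len(texts)
    errors = []

    def worker(host):
        # Each host processes one text at a time until the queue is empty.
        while True:
            try:
                index, text = pending.get_nowait()
            except queue.Empty:
                return
            try:
                results[index] = embed_fn(host, text)
            except Exception as exc:  # keep going; report the failure later
                errors.append((index, host, exc))

    threads = [threading.Thread(target=worker, args=(host,)) for host in hosts]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, errors


if __name__ == "__main__":
    # Hypothetical addresses: two machines on the LAN running Ollama
    # on its default port.
    example_hosts = ["http://192.168.1.10:11434", "http://192.168.1.11:11434"]
    vectors, failures = embed_across_hosts(["first note", "second note"], example_hosts)
    print(len(vectors), "texts processed,", len(failures), "failed")
```

Pulling from a shared queue rather than assigning texts to hosts up front means a slow or busy machine does not hold back the rest, and collecting failures per item would allow a retry of just those items, which might also help with the random errors described above.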
Alternatives
No response
Additional Context
No response