@fbugarski

No description provided.

@dborovcanin dborovcanin marked this pull request as ready for review December 31, 2025 10:33

@dborovcanin dborovcanin left a comment

I do not see much documentation about the proxy. Was it added in previous PRs?

Comment on lines 46 to 47
Cube AI embeddings are generated inside **Trusted Execution Environments (TEEs)**,
ensuring that both input text and resulting vectors remain confidential.

Add a bit more context here: a link to TEE documentation, plus a brief explanation of how a TEE protects the workload and why that is important.

Move this to the later section, so we have a flow:

  • RAG intro
  • RAG explanation
  • TEEs & how they help RAG and prompts

Comment on lines 6 to 8
Cube AI exposes language models through a **domain-scoped models registry**.
This endpoint allows clients to discover which models are available for inference
within a specific Cube AI domain.

What are domains, and what is a domain-scoped models registry?

- domain-scoped permissions
- per-domain model visibility

### Domain Isolation

Again, what is a domain? Workspace terminology resonates better with users for this concept, but regardless, it needs to be explained.

Signed-off-by: Filip Bugarski <[email protected]>
Signed-off-by: Filip Bugarski <[email protected]>
### Example Request

```bash
curl -k https://localhost/proxy/<domain_id>/v1/embeddings \

You can test a simple real-life case and document it here.
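
One possible shape for such a case, as a rough sketch rather than the documented client: embed two short texts and compare them by cosine similarity. Only the `/proxy/<domain_id>/v1/embeddings` path comes from the PR. The OpenAI-style request body (`model`, `input`), the bearer-token header, and the helper names `build_embeddings_request` and `cosine_similarity` are assumptions for illustration and would need checking against the actual proxy.

```python
import json
import math
import urllib.request


def build_embeddings_request(base_url, domain_id, token, model, texts):
    """Build (but do not send) a POST request for the embeddings endpoint.

    Assumption: the proxy accepts an OpenAI-style JSON body with
    "model" and "input" fields and a bearer token for authentication.
    """
    url = f"{base_url.rstrip('/')}/proxy/{domain_id}/v1/embeddings"
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

To try it, pass the request to `urllib.request.urlopen` and decode the JSON response. If the response follows the OpenAI layout (again an assumption), each vector sits under `data[i]["embedding"]`, and two related sentences should score higher with `cosine_similarity` than two unrelated ones. The `-k` flag in the curl example skips TLS verification, which is only appropriate for a local deployment with a self-signed certificate.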

Models in Cube AI are used by:

- Chat Completions
- Continue (VS Code integration)

We also support opencode.

### Ollama

When using Ollama as a backend, models are referenced by their Ollama identifiers
(e.g. `tinyllama:1.1b`, `starcoder2:3b`).

A superadmin can add or remove models.

- Model IDs are backend-specific (Ollama / vLLM)
- Models are isolated per domain
- All inference runs inside a Trusted Execution Environment (TEE)
- Models are **domain-scoped**, meaning their visibility and usage are limited

This is not correct: models are platform-wide, not specific to domains.

Signed-off-by: Filip Bugarski <[email protected]>
Signed-off-by: Filip Bugarski <[email protected]>

@drasko drasko merged commit 0e4dbfc into main Jan 21, 2026
1 check passed
5 participants