Add Ollama Support with Local Model Discovery and Embeddings #875
base: main
Conversation
- Implement Ollama integration for LLM and embeddings
- Add OllamaConfigurationPanel for UI configuration
- Add model discovery service for automatic model detection
- Update RAG settings to support Ollama embeddings
- Add documentation (INFRASTRUCTURE.md, PLAN.md, QUICKSTART.md)
- Add E2E tests for Ollama functionality

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
Walkthrough
Adds per-instance Ollama authentication (frontend persistence, UI, and backend propagation), a native Ollama embeddings adapter, auth-aware model discovery and health checks, Supabase infra docs/quickstart, Playwright/e2e and unit tests, Docker restart policies, multi-dimensional search RPCs, and PDF/OCR extraction improvements.
Changes
Sequence Diagram(s)
sequenceDiagram
actor User
participant UI as Frontend (RAGSettings / OllamaConfig)
participant API as Archon API (ollama_api)
participant Discovery as ModelDiscoveryService
participant Ollama as Ollama Instance
User->>UI: Configure Ollama URL + useAuth + authToken
UI->>API: Save ragSettings (includes per-instance tokens)
User->>UI: Trigger health check / discover models
UI->>API: GET /health or /discover with instance URLs
API->>API: Normalize URLs, map tokens from rag_strategy
API->>Discovery: check_instance_health/discover_models(url, auth_token)
Discovery->>Ollama: HTTP request (Authorization: Bearer token if provided)
Ollama-->>Discovery: health/models response
Discovery-->>API: aggregated result
API-->>UI: health/discovery result
UI-->>User: display status and available models
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
…if block

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>

- Add test for Ollama with auth token
- Add test for Ollama without auth token (local setup)
- Update fallback test to expect 'required-but-ignored' placeholder
- Add RAG settings mock to multi-provider sequence test

🤖 Generated with Claude Code
Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
archon-ui-main/src/services/credentialsService.ts (1)
495-547: Critical: Authentication tokens are not persisted to the database.

The authToken field is added to the OllamaInstance interface (line 68) and collected in the UI, but it is never saved in the setOllamaInstances method. Lines 507-528 define the fields object that gets persisted, and authToken is not included.

Impact: Users can enter authentication tokens in the UI, but these tokens will be silently discarded and not saved to the database. This breaks the per-instance authentication feature entirely.

Fix: Add authToken to the persisted fields:

  const fields: Record<string, any> = {
    name: instance.name,
    baseUrl: instance.baseUrl,
    isEnabled: instance.isEnabled,
    isPrimary: instance.isPrimary,
    instanceType: instance.instanceType || 'both',
    loadBalancingWeight: instance.loadBalancingWeight || 100
  };
+ // Add optional authToken if present
+ if (instance.authToken) {
+   fields.authToken = instance.authToken;
+ }
+
  // Add optional health-related fields
  if (instance.isHealthy !== undefined) {

Also update the loading logic in getOllamaInstances (around line 465) to handle the authToken field when reconstructing instances from credentials.

python/src/server/services/llm_provider_service.py (1)
1211-1245: Ollama instance validation doesn't pass auth_token to health check

validate_provider_instance still calls:

health_status = await model_discovery_service.check_instance_health(instance_url)

with no auth_token, while check_instance_health now supports token-based auth and /api/ollama/instances/health already uses it. For instances with authentication enabled, /api/ollama/validate will incorrectly report them as unavailable.

Consider resolving an auth token the same way as in health_check_endpoint and passing it through:

from .credential_service import credential_service

rag_settings = await credential_service.get_credentials_by_category("rag_strategy")
llm_base_url = (rag_settings.get("LLM_BASE_URL") or "").replace("/v1", "").rstrip("/")
embedding_base_url = (rag_settings.get("OLLAMA_EMBEDDING_URL") or "").replace("/v1", "").rstrip("/")
chat_auth_token = rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN") or ""
embedding_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN") or ""

normalized = instance_url.replace("/v1", "").rstrip("/")
auth_token = None
if normalized == llm_base_url:
    auth_token = chat_auth_token or None
elif normalized == embedding_base_url:
    auth_token = embedding_auth_token or None

health_status = await model_discovery_service.check_instance_health(instance_url, auth_token=auth_token)

(ideally sharing a helper with the API route to avoid duplication).
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
786-805: Resetting instance configs drops useAuth/authToken fields

In several places the instance config state is reset or auto-populated without including the new auth fields:

- handleDeleteLLMInstance and handleDeleteEmbeddingInstance reset configs to { name: '', url: '' }.
- The "auto-populate embedding instance if it's empty" path in the LLM URL change handler constructs an embedding config with just name and url.

Given the initial state includes useAuth and authToken, this:

- Conflicts with the inferred state type under strict TypeScript.
- Can cause React to flip checkboxes between controlled and uncontrolled (checked={undefined}).
- Silently clears any auth state when users delete or auto-populate instances.

A simple fix is to always provide the full shape when setting these configs, e.g.:

  // LLM delete
- setLLMInstanceConfig({
-   name: '',
-   url: ''
- });
+ setLLMInstanceConfig({
+   name: '',
+   url: '',
+   useAuth: false,
+   authToken: ''
+ });

  // Embedding delete
- setEmbeddingInstanceConfig({
-   name: '',
-   url: ''
- });
+ setEmbeddingInstanceConfig({
+   name: '',
+   url: '',
+   useAuth: false,
+   authToken: ''
+ });

  // Auto-populate embedding from LLM URL
- if (!embeddingInstanceConfig.url || !embeddingInstanceConfig.name) {
-   setEmbeddingInstanceConfig({
-     name: llmInstanceConfig.name || 'Default Ollama',
-     url: newUrl
-   });
- }
+ if (!embeddingInstanceConfig.url || !embeddingInstanceConfig.name) {
+   setEmbeddingInstanceConfig({
+     name: llmInstanceConfig.name || 'Default Ollama',
+     url: newUrl,
+     useAuth: embeddingInstanceConfig.useAuth ?? false,
+     authToken: embeddingInstanceConfig.authToken ?? '',
+   });
+ }

This keeps state shape consistent and avoids subtle UI glitches.
Also applies to: 809-827, 2329-2354
1845-1899: Selected model summary uses CHAT_MODEL, which is never written here

The new summary row shows:

{activeSelection === 'chat' ? (
  <span className="text-white">
    {ragSettings.CHAT_MODEL || <span className="text-gray-500 italic">Not selected</span>}
  </span>
) : ( ... )}

But throughout this component the chat model is managed via MODEL_CHOICE:

- Provider buttons set MODEL_CHOICE.
- OllamaModelSelectionModal's onSelectModel updates MODEL_CHOICE.
- getDisplayedChatModel reads MODEL_CHOICE and the provider to display the active chat model.

CHAT_MODEL is defined in the props type but never written here, so the summary will often show "Not selected" even when a chat model is configured.

Consider either:

- Reusing getDisplayedChatModel(ragSettings) in this row, or
- Keeping CHAT_MODEL as the canonical key and updating it whenever MODEL_CHOICE changes.

For example:

- {ragSettings.CHAT_MODEL || <span className="text-gray-500 italic">Not selected</span>}
+ {getDisplayedChatModel(ragSettings) || <span className="text-gray-500 italic">Not selected</span>}

The authentication status row itself looks good and does not leak token values.
🧹 Nitpick comments (12)
QUICKSTART.md (1)
1-47: LGTM! Clear and concise quickstart guide.

The quickstart guide provides a focused, step-by-step workflow for local development setup. The structure is clean and the instructions are easy to follow.

Optional: For improved markdown formatting, consider wrapping bare URLs in angle brackets per markdown best practices (lines 25-28):

-| **Archon UI** | http://localhost:3737 | Hauptanwendung |
+| **Archon UI** | <http://localhost:3737> | Hauptanwendung |

This is a minor style preference and does not affect functionality.
python/src/server/api_routes/ollama_api.py (1)
83-142: Align all Ollama health/model flows with auth-token support

You've wired per-instance auth tokens into /models and /instances/health, but other Ollama flows in this module still call discovery/health without tokens:

- /validate → validate_provider_instance("ollama", instance_url) (the service currently calls check_instance_health with no auth_token).
- /models/discover-and-store and /models/discover-with-details still hit Ollama without any Authorization header.

For auth-protected instances these will now systematically report failures even though /instances/health can succeed with tokens.

Recommend:

- Updating validate_provider_instance (in llm_provider_service.py) to resolve an auth_token exactly the way health_check_endpoint does and pass it through to check_instance_health.
- Planning a follow-up to add token support to the discover-and-store / discover-with-details endpoints as well, or explicitly documenting that they currently require unauthenticated instances (a hedged sketch of this pattern follows after this comment).
discover-and-store/discover-with-detailsendpoints as well, or explicitly document that they currently require unauthenticated instances.Also applies to: 166-217, 262-305
archon-ui-main/tests/e2e/ollama-api-mode.spec.ts (1)
1-191: Reduce hard-coded timeouts and brittle selectors in Playwright tests

The flows are clear, but they're quite fragile:

- Almost every step uses waitForTimeout(...) instead of waiting on concrete UI conditions (waitForSelector, expect(...).toBeVisible()/toBeEnabled()). This slows tests and risks flakiness.
- Many selectors are generic text queries like text=Embedding or button:has-text("Config"), which can easily break on copy changes or multiple matches.
- Navigation to Settings → RAG Settings → Embedding → Ollama Config is duplicated across tests.

Suggested improvements (can be incremental):

- Replace sleeps with condition-based waits around key elements.
- Prefer role-/test-id–based locators (getByRole, getByTestId) over raw text where possible.
- Extract the repeated "open Ollama config in RAG Settings" flow into a helper to DRY the suite.
python/src/server/services/embeddings/embedding_service.py (2)
107-172: NativeOllamaEmbeddingAdapter looks correct but has unbounded per-batch concurrency

The adapter correctly targets /api/embeddings with { "model": ..., "prompt": ... } and wraps HTTP/JSON failures in EmbeddingAPIError, which is consistent with the rest of the service.

Two things to consider (a hedged concurrency-limiting sketch follows below):

- asyncio.gather(*...) will fire one HTTP request per text concurrently for each batch. With large EMBEDDING_BATCH_SIZE values this could mean dozens or hundreds of parallel requests to a single Ollama instance. You may want to cap per-batch concurrency via a semaphore or chunking to avoid overloading local/remote instances.
- The dimensions argument is currently ignored for native Ollama. That's fine (Ollama controls embedding size), but it may be worth documenting in the docstring or via a comment to avoid confusion later.
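A minimal sketch of the semaphore idea, assuming an httpx-based adapter; the class and method names approximate the adapter described in this review, not the exact PR code:

```python
# Hedged sketch: cap per-batch concurrency so a large EMBEDDING_BATCH_SIZE
# cannot flood a single Ollama instance with parallel embedding requests.
import asyncio
import httpx


class BoundedOllamaEmbedder:
    def __init__(self, base_url: str, auth_token: str | None = None, max_concurrency: int = 8):
        self.base_url = base_url.rstrip("/")
        self.headers = {"Authorization": f"Bearer {auth_token}"} if auth_token else {}
        self._semaphore = asyncio.Semaphore(max_concurrency)

    async def _embed_one(self, client: httpx.AsyncClient, model: str, text: str) -> list[float]:
        async with self._semaphore:  # at most max_concurrency requests in flight per batch
            resp = await client.post(
                f"{self.base_url}/api/embeddings",
                json={"model": model, "prompt": text},
                headers=self.headers,
            )
        resp.raise_for_status()
        return resp.json()["embedding"]

    async def create_embeddings(self, texts: list[str], model: str) -> list[list[float]]:
        async with httpx.AsyncClient(timeout=30.0) as client:
            return await asyncio.gather(*(self._embed_one(client, model, t) for t in texts))
```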
287-303: Adapter selection for Ollama depends on the api_mode string value – normalize it

_get_embedding_adapter checks:

if provider_name == "ollama" and base_url:
    if (api_mode or "native") == "native":
        ...

If OLLAMA_API_MODE ever comes through as "Native"/"NATIVE" or with whitespace, this will silently fall into the OpenAI-compatible branch even though the intent is "native".

Given you already read/derive ollama_api_mode above, consider normalizing it once (e.g., .lower().strip()) before passing into _get_embedding_adapter, and/or do the normalization inside _get_embedding_adapter:

- if (api_mode or "native") == "native":
+ mode = (api_mode or "native").strip().lower()
+ if mode == "native":

This makes the behavior robust to minor config variations.
python/src/server/services/ollama/model_discovery_service.py (3)
122-133: Auth token support for /api/tags is correct but not propagated to detail/capability calls

The new auth_token parameter is correctly wired into the /api/tags discovery call and keeps behavior unchanged when no token is provided. However, follow-up calls such as _get_model_details() and the capability tests still hit Ollama endpoints without any auth header. On protected instances this will silently 401 and you'll lose detailed metadata and accurate capability detection, even though listing itself succeeded with a token.

If you want full parity for authenticated instances, consider threading auth_token through the enrichment pipeline, e.g.:

- async def discover_models(self, instance_url: str, fetch_details: bool = False, auth_token: str | None = None) -> list[OllamaModel]:
+ async def discover_models(
+     self,
+     instance_url: str,
+     fetch_details: bool = False,
+     auth_token: str | None = None,
+ ) -> list[OllamaModel]:
@@
-     enriched_models = await self._enrich_model_capabilities(models, instance_url, fetch_details=fetch_details)
+     enriched_models = await self._enrich_model_capabilities(
+         models,
+         instance_url,
+         fetch_details=fetch_details,
+         auth_token=auth_token,
+     )

and then update _enrich_model_capabilities, _get_model_details, and any capability helpers that hit Ollama directly to accept and forward auth_token in their HTTP headers.

This keeps non-auth flows identical while making "fetch_details=True" actually usable with locked-down Ollama instances.
Also applies to: 192-199
964-1004: Health check auth header wiring looks good; cache key ignores token changes

Passing auth_token and conditionally adding a Bearer header here is straightforward and matches the new health-check API behavior. One nuance: the health_cache key is only health_{instance_url}, so a failure without a token and a success with a token (or vice versa) will be cached identically for 30 seconds.

If you expect users to toggle/authenticate instances and immediately rely on the health UI, consider including the auth presence in the cache key, e.g. health_{instance_url}_{bool(auth_token)} or similar, so that "with token" and "without token" statuses don't mask each other in the short term.
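A minimal sketch of that idea, assuming the service keeps a plain dict cache with a 30-second TTL (the real field names in model_discovery_service.py may differ):

```python
# Hedged sketch: key the health cache on URL *and* token presence so results
# obtained with and without credentials don't overwrite each other.
import time


def health_cache_key(instance_url: str, auth_token: str | None) -> str:
    return f"health_{instance_url}_{bool(auth_token)}"


def get_cached_health(cache: dict, instance_url: str, auth_token: str | None, ttl: float = 30.0):
    entry = cache.get(health_cache_key(instance_url, auth_token))
    if entry and (time.monotonic() - entry["checked_at"]) < ttl:
        return entry["status"]
    return None
```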
1026-1060: Per-instance auth_tokens map depends on exact URL string matching

The auth_tokens dict is plumbed correctly into discover_models_from_multiple_instances, and using auth_tokens.get(url) is nice and simple. This does assume that the keys in auth_tokens are in the same normalized form as the instance_urls passed here (e.g. either both with /v1 stripped and no trailing slash, or both unnormalized).

If the router that constructs auth_tokens is normalizing URLs (as in the health-check endpoint), it would be safer to normalize url before doing auth_tokens.get(...):

- tasks = [
-     self.discover_models(url, fetch_details=fetch_details, auth_token=auth_tokens.get(url))
-     for url in instance_urls
- ]
+ tasks = [
+     self.discover_models(
+         url,
+         fetch_details=fetch_details,
+         auth_token=auth_tokens.get(url.rstrip("/").replace("/v1", "")),
+     )
+     for url in instance_urls
+ ]

so a mismatch in /v1 or trailing slashes doesn't quietly drop the token.

archon-ui-main/src/components/settings/RAGSettings.tsx (4)
208-221: Instance config + token sync logic is solid; consider typing RAG token fields

The new llmInstanceConfig/embeddingInstanceConfig shapes and the useEffect blocks that sync from LLM_BASE_URL/OLLAMA_* fields into these configs look logically correct:

- Refs (lastLLMConfigRef/lastEmbeddingConfigRef) prevent infinite loops.
- useAuth is derived from token presence and authToken is updated only when the incoming value changes.

The only minor concern is the reliance on (ragSettings as any).OLLAMA_*_AUTH_TOKEN to satisfy TypeScript. Since these keys are now part of your configuration surface, it would be cleaner to add them to RAGSettingsProps["ragSettings"] explicitly:

  ragSettings: {
    MODEL_CHOICE: string;
    CHAT_MODEL?: string;
+   OLLAMA_CHAT_AUTH_TOKEN?: string;
+   OLLAMA_EMBEDDING_AUTH_TOKEN?: string;
@@
    OLLAMA_EMBEDDING_URL?: string;
    OLLAMA_EMBEDDING_INSTANCE_NAME?: string;

so you can drop the as any casts and keep strict typing end-to-end.

Also applies to: 224-271
570-583: Model discovery fetches work but bypass TanStack Query and may over-fetch

ollamaMetrics now tracks llmAvailableModels/embeddingAvailableModels, and loadAvailableModels() hits /api/ollama/models both:

- Automatically when the instance URLs change, and
- After a successful manualTestConnection when a context is provided.

This is functionally fine and the error handling is intentionally soft, but two points:

1. Project guidelines say "Use TanStack Query for all data fetching". These new fetches bypass that layer, so if you already have React Query plumbing for Ollama endpoints, consider converting loadAvailableModels into a query (with enabled flags keyed on URLs) instead of a bespoke fetch. That would centralize retries, caching, and error behavior.

2. The URL-change effects fire regardless of the active provider; if a user temporarily switches away from Ollama, you still ping /api/ollama/models whenever the URL state changes. You may want to guard them:

- useEffect(() => {
-   if (llmInstanceConfig.url && llmInstanceConfig.url !== DEFAULT_OLLAMA_URL) {
+ useEffect(() => {
+   if (
+     chatProvider === 'ollama' &&
+     llmInstanceConfig.url &&
+     llmInstanceConfig.url !== DEFAULT_OLLAMA_URL
+   ) {
      loadAvailableModels(llmInstanceConfig.url, 'chat');
    }
- }, [llmInstanceConfig.url]);
+ }, [chatProvider, llmInstanceConfig.url]);

Same idea for the embedding instance. This reduces unnecessary calls when Ollama isn't currently in use.

Also applies to: 640-691, 751-755
1595-1652: OLLAMA_API_MODE UI is clear; consider typing the setting key

The API mode toggle between "Native Ollama API" and "OpenAI-Compatible" is well-designed and the defaulting logic (native when unset) is clear. The only gap is that OLLAMA_API_MODE isn't declared in RAGSettingsProps["ragSettings"], so all setRagSettings({ ...ragSettings, OLLAMA_API_MODE: ... }) calls rely on structural typing and the setRagSettings: any escape hatch.

For better static safety, add:

  ragSettings: {
@@
    OLLAMA_EMBEDDING_INSTANCE_NAME?: string;
+   OLLAMA_API_MODE?: 'native' | 'openai-compatible';

so backend and frontend stay in sync as this option evolves.
2351-2387: Auth settings in modals are wired correctly; consider clearing tokens on delete

The new auth sections in the LLM and Embedding edit modals behave as expected:

- useAuth toggles a password input.
- Tokens are stored in llmInstanceConfig.authToken / embeddingInstanceConfig.authToken.
- Saving the modal pushes tokens into ragSettings under OLLAMA_CHAT_AUTH_TOKEN and OLLAMA_EMBEDDING_AUTH_TOKEN.

Two small follow-ups:

1. When deleting an instance (LLM or Embedding), you clear LLM_BASE_URL/OLLAMA_EMBEDDING_URL but leave the corresponding OLLAMA_*_AUTH_TOKEN in ragSettings. It might be more intuitive (and safer) to also blank those keys so "delete instance" truly removes all associated configuration.

2. The modal "Save Changes" handlers currently only update local ragSettings; persistence happens when the user later clicks the top-level "Save Settings" button. That matches existing behavior for other fields, but the success toast ("instance updated successfully") might be interpreted as "persisted to server". If you've had user confusion around this, consider calling credentialsService.updateRagSettings(updatedSettings) here as well.

Functionally, the core token wiring looks good and doesn't expose secrets in the UI.

Also applies to: 2400-2405, 2446-2477, 2490-2495
def _get_embedding_adapter(
    provider: str, client: Any, base_url: str | None = None, auth_token: str | None = None, api_mode: str | None = None
) -> EmbeddingProviderAdapter:
    provider_name = (provider or "").lower()
    if provider_name == "google":
        search_logger.info("Using Google native embedding adapter")
        return GoogleEmbeddingAdapter()
    if provider_name == "ollama" and base_url:
        # Check API mode - default to native if not specified or if set to native
        if (api_mode or "native") == "native":
            search_logger.info(f"Using Ollama native API adapter with base URL: {base_url}")
            return NativeOllamaEmbeddingAdapter(base_url, auth_token)
        else:
            search_logger.info(f"Using OpenAI-compatible adapter for Ollama with base URL: {getattr(client, 'base_url', 'N/A')}")
            return OpenAICompatibleEmbeddingAdapter(client)
    search_logger.info(f"Using OpenAI-compatible embedding adapter for provider: {provider_name}")
    return OpenAICompatibleEmbeddingAdapter(client)
Handle missing OLLAMA_EMBEDDING_URL more cleanly and avoid unnecessary Ollama config work for other providers
In create_embeddings_batch you do:
ollama_base_url = rag_settings.get("OLLAMA_EMBEDDING_URL", "").rstrip("/v1").rstrip("/")
ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "")
ollama_api_mode = rag_settings.get("OLLAMA_API_MODE", "native")

If OLLAMA_EMBEDDING_URL is present but set to None, the .rstrip call will raise and force the except path, even though this is just “no embedding URL configured”. You also compute these Ollama-specific values even when embedding_provider is not "ollama".
You could make this more robust and targeted as:
- ollama_base_url = rag_settings.get("OLLAMA_EMBEDDING_URL", "").rstrip("/v1").rstrip("/")
- ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "")
- ollama_api_mode = rag_settings.get("OLLAMA_API_MODE", "native")
+ ollama_base_url_raw = (rag_settings.get("OLLAMA_EMBEDDING_URL") or "").strip()
+ ollama_base_url = ollama_base_url_raw.rstrip("/v1").rstrip("/") if ollama_base_url_raw else ""
+ ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "") or ""
+ ollama_api_mode = (rag_settings.get("OLLAMA_API_MODE") or "native").lower()and only pass ollama_base_url/ollama_auth_token into _get_embedding_adapter when embedding_provider.lower() == "ollama".
This avoids noisy warnings from benign None values and keeps non-Ollama providers from incurring extra config work. Normalizing ollama_api_mode to lowercase also makes _get_embedding_adapter resilient to casing differences.
Also applies to: 433-469
🤖 Prompt for AI Agents
python/src/server/services/embeddings/embedding_service.py around lines 287-303
(also apply same change to 433-469): the code currently computes
ollama_base_url/auth_token/api_mode unconditionally and calls .rstrip on a value
that may be None, which can raise and causes unnecessary Ollama handling for
other providers; change the call sites to only compute and sanitize
OLLAMA_EMBEDDING_URL/AUTH_TOKEN/API_MODE when embedding_provider.lower() ==
"ollama" (guard against None before calling .rstrip), normalize api_mode to
lowercase, and only pass ollama_base_url and ollama_auth_token into
_get_embedding_adapter when the provider is Ollama so non-Ollama providers avoid
extra config work and you won't call .rstrip on None.
🧩 Analysis chain
Ensure OLLAMA_*_AUTH_TOKEN values from rag_settings are actual tokens, not “[ENCRYPTED]” placeholders
Both here and in llm_provider_service.get_llm_client you read:
ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "")
# or
ollama_auth_token = rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN", "")The provided credential_service.get_credentials_by_category("rag_strategy") implementation returns a dict with {"value": "[ENCRYPTED]", "is_encrypted": True, ...} for encrypted keys, not the real secret value. If the Ollama auth tokens are stored as encrypted rag_strategy entries, these lookups will hand [ENCRYPTED] placeholders (or dicts) to the HTTP layer, and auth will consistently fail.
Please double-check:
- How OLLAMA_*_AUTH_TOKEN is stored (encrypted vs plaintext) in archon_settings for the rag_strategy category.
- Whether there is a backend-only path for retrieving decrypted values (while still redacting them in the UI) that could be used instead of the generic get_credentials_by_category.
If tokens are currently encrypted and only exposed through [ENCRYPTED] placeholders, this retrieval logic will need to be adjusted for Ollama auth to actually work.
Also applies to: 433-469
I'll verify the review comment by examining the codebase to understand how OLLAMA auth tokens are retrieved and whether they may be "[ENCRYPTED]" placeholders.
---
🏁 Script executed:
```shell
#!/bin/bash
# Locate key files
fd embedding_service.py
fd credential_service.py
fd llm_provider_service.py
```

Length of output: 444
🏁 Script executed:
#!/bin/bash
# Examine the embedding_service.py lines around 287-303 and 433-469
cat python/src/server/services/embeddings/embedding_service.py | sed -n '280,310p'
cat python/src/server/services/embeddings/embedding_service.py | sed -n '430,475p'

Length of output: 4073
🏁 Script executed:
#!/bin/bash
# Find how rag_settings are obtained and where OLLAMA_EMBEDDING_AUTH_TOKEN is used
rg -n "OLLAMA_EMBEDDING_AUTH_TOKEN\|OLLAMA_CHAT_AUTH_TOKEN" --type py -B 2 -A 2Length of output: 0
🏁 Script executed:
#!/bin/bash
# Find the credential_service implementation
fd credential_service.py -type f

Length of output: 170
🏁 Script executed:
#!/bin/bash
# Search for get_credentials_by_category and how it returns encrypted values
rg -n "get_credentials_by_category\|ENCRYPTED" --type py -B 2 -A 2Length of output: 0
---
🏁 Script executed:
```shell
#!/bin/bash
# Examine credential_service.py to see get_credentials_by_category implementation
cat python/src/server/services/credential_service.py
```

Length of output: 25935
🏁 Script executed:
#!/bin/bash
# Search for OLLAMA token references using simpler patterns
rg "OLLAMA_EMBEDDING_AUTH_TOKEN" --type py -B 3 -A 3
rg "OLLAMA_CHAT_AUTH_TOKEN" --type py -B 3 -A 3Length of output: 7033
🏁 Script executed:
#!/bin/bash
# Examine llm_provider_service.py to find the get_llm_client function and token usage
cat python/src/server/services/llm_provider_service.py | head -500

Length of output: 20625
🏁 Script executed:
#!/bin/bash
# Search for ENCRYPTED placeholder or decryption logic
rg "ENCRYPTED" --type py -B 2 -A 2
rg "decrypt\|is_encrypted" --type py -B 2 -A 2Length of output: 1368
---
🏁 Script executed:
```shell
#!/bin/bash
# Check ollama_api.py to see if it has the same issue with auth tokens
cat python/src/server/api_routes/ollama_api.py | grep -n -B 5 -A 5 "OLLAMA_EMBEDDING_AUTH_TOKEN\|OLLAMA_CHAT_AUTH_TOKEN"
```

Length of output: 1237
🏁 Script executed:
#!/bin/bash
# Verify if OLLAMA auth tokens are actually stored as encrypted in the database schema or init
rg "OLLAMA_EMBEDDING_AUTH_TOKEN\|OLLAMA_CHAT_AUTH_TOKEN" --type py -B 5 -A 5 | grep -i "encrypt\|set_credential"Length of output: 0
🏁 Script executed:
#!/bin/bash
# Check how settings are initialized - are auth tokens marked encrypted?
fd "init\|seed\|migration" --type py | head -20
cat python/src/server/services/settings_service.py 2>/dev/null || echo "File not found"

Length of output: 184
Replace category-based credential retrieval with backend-only decryption for Ollama auth tokens
The identified issue is confirmed: credential_service.get_credentials_by_category("rag_strategy") returns {"value": "[ENCRYPTED]", "is_encrypted": True} for encrypted entries, causing the literal string "[ENCRYPTED]" to be passed as auth tokens instead of actual decrypted values.
This affects multiple locations:
- embedding_service.py (line ~453): ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "")
- llm_provider_service.py (lines ~290, ~340): Similar pattern for both OLLAMA_EMBEDDING_AUTH_TOKEN and OLLAMA_CHAT_AUTH_TOKEN
- ollama_api.py (lines ~107, ~192): Same problematic pattern
Replace with backend-only credential retrieval:
ollama_auth_token = await credential_service.get_credential("OLLAMA_EMBEDDING_AUTH_TOKEN", default="", decrypt=True)

The get_credential(key, decrypt=True) method already exists in credential_service and properly handles decryption while keeping UI redaction intact.
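A compact sketch of that retrieval pattern follows; the helper name is hypothetical, and the get_credential signature with decrypt=True is taken from the description above:

```python
# Hedged sketch: fetch non-secret settings from the category map, but pull the
# token through the decrypting accessor so the real value reaches the adapter.
async def load_ollama_embedding_config(credential_service) -> tuple[str, str, str]:
    rag_settings = await credential_service.get_credentials_by_category("rag_strategy")

    raw_url = (rag_settings.get("OLLAMA_EMBEDDING_URL") or "").strip()
    base_url = raw_url.rstrip("/") if raw_url else ""

    # Backend-only decryption: returns the real token, not the "[ENCRYPTED]" placeholder.
    auth_token = await credential_service.get_credential(
        "OLLAMA_EMBEDDING_AUTH_TOKEN", default="", decrypt=True
    )

    api_mode = (rag_settings.get("OLLAMA_API_MODE") or "native").strip().lower()
    return base_url, auth_token, api_mode
```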
🤖 Prompt for AI Agents
python/src/server/services/embeddings/embedding_service.py around lines 287-303,
the code currently pulls Ollama auth tokens from settings/credential categories
which can yield the literal "[ENCRYPTED]" string; replace those lookups with a
backend-only decrypted retrieval by calling
credential_service.get_credential("OLLAMA_EMBEDDING_AUTH_TOKEN", default="",
decrypt=True) (awaiting it if async) and use that returned value when
constructing Ollama adapters so the actual decrypted token is passed instead of
the encrypted placeholder.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
python/src/agent_work_orders/utils/state_reconciliation.py (1)
159-161: Fix Windows root directory detection logic.

The check len(path_obj.parts) == 2 and path_obj.parts[1] == "" will never match Windows drive roots. On Windows, Path("C:\\").parts returns ("C:\\",) (1 element, not 2). This allows dangerous paths like "C:" to bypass the root directory safety check.

Apply this diff to correctly detect Windows drive roots:

- is_not_root = not (
-     orphan_path_resolved in ("/", "\\") or
-     (os.name == "nt" and len(path_obj.parts) == 2 and path_obj.parts[1] == "")
- )
+ is_not_root = not (
+     orphan_path_resolved in ("/", "\\") or
+     (os.name == "nt" and len(path_obj.parts) == 1)
+ )

archon-ui-main/src/services/credentialsService.ts (1)
62-75: OllamaInstance.authToken is never persisted or reloaded, breaking per-instance auth

Verification confirms the issue. In getOllamaInstances (lines 437-485), the credential parsing correctly extracts all fields into instanceMap, but the final instance object construction (lines 477-487) explicitly omits authToken from the returned array. Similarly, in setOllamaInstances (lines 495-543), the fields map excludes authToken entirely, so any token set in the UI is never persisted to credentials storage.

To fix this:

- Include authToken in the returned instance object (line ~487):

  instances.push({
    id: instance.id,
    name: instance.name,
    baseUrl: instance.baseUrl,
    isEnabled: instance.isEnabled ?? true,
    isPrimary: instance.isPrimary ?? false,
    instanceType: instance.instanceType ?? 'both',
    loadBalancingWeight: instance.loadBalancingWeight ?? 100,
    isHealthy: instance.isHealthy,
    responseTimeMs: instance.responseTimeMs,
    modelsAvailable: instance.modelsAvailable,
    lastHealthCheck: instance.lastHealthCheck,
+   authToken: instance.authToken,
  });

- Add authToken to the persisted fields in setOllamaInstances (after line 514):

  if (instance.lastHealthCheck) {
    fields.lastHealthCheck = instance.lastHealthCheck;
  }
+
+ if (instance.authToken) {
+   fields.authToken = instance.authToken;
+ }

Alternatively, if tokens should live only in global credentials (e.g., OLLAMA_CHAT_AUTH_TOKEN), remove authToken from the OllamaInstance interface entirely to avoid the false expectation of per-instance storage.

archon-ui-main/src/components/settings/RAGSettings.tsx (3)
2346-2354: Sync authentication settings when using same host for both instances.

When the "Use same host for embedding instance" checkbox is enabled, only the url and name fields are synced. If the LLM instance has authentication enabled, the embedding instance should inherit those settings to avoid configuration mismatch.

Apply this diff:

  onChange={(e) => {
    if (e.target.checked) {
      // Sync embedding instance with LLM instance
      setEmbeddingInstanceConfig({
        name: llmInstanceConfig.name || 'Default Ollama',
        url: llmInstanceConfig.url,
-       useAuth: embeddingInstanceConfig.useAuth,
-       authToken: embeddingInstanceConfig.authToken
+       useAuth: llmInstanceConfig.useAuth,
+       authToken: llmInstanceConfig.authToken
      });
    }
  }}
787-828: Clear authentication tokens when deleting instance configurations.

The instance deletion handlers reset the instance config objects but don't remove the corresponding auth token keys from ragSettings. This leaves orphaned authentication data.

Apply these diffs:

  const handleDeleteLLMInstance = () => {
    if (window.confirm('Are you sure you want to delete the current LLM instance configuration?')) {
      // Reset LLM instance configuration
      setLLMInstanceConfig({
        name: '',
-       url: ''
+       url: '',
+       useAuth: false,
+       authToken: ''
      });
      // Clear related RAG settings
      const updatedSettings = { ...ragSettings };
      delete updatedSettings.LLM_BASE_URL;
+     delete updatedSettings.LLM_INSTANCE_NAME;
      delete updatedSettings.MODEL_CHOICE;
+     delete updatedSettings.OLLAMA_CHAT_AUTH_TOKEN;
      setRagSettings(updatedSettings);

Apply similar changes to handleDeleteEmbeddingInstance (lines 809-828).
1845-1899: Frontend will display "Not selected" for CHAT_MODEL due to field name mismatch.

The concern in the original review is valid. Backend stores the Ollama chat model selection as OLLAMA_CHAT_MODEL in the database, but frontend line 1892 attempts to display ragSettings.CHAT_MODEL, which doesn't exist.

Backend flow: Database stores OLLAMA_CHAT_MODEL → credential_service.get_credentials_by_category("rag_strategy") returns raw dict with OLLAMA_CHAT_MODEL key → Frontend receives dict without CHAT_MODEL field.

Fix required: Either update the backend to map OLLAMA_CHAT_MODEL to CHAT_MODEL in the response, or update frontend line 1892 to display ragSettings.OLLAMA_CHAT_MODEL instead.
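If the backend-mapping option is chosen, the shape of that fix could look roughly like this; the function and the exact settings payload are assumptions, not code from the PR:

```python
# Hedged sketch of the backend-mapping option: expose the stored OLLAMA_CHAT_MODEL
# under the CHAT_MODEL key the frontend reads.
def with_chat_model_alias(rag_settings: dict) -> dict:
    """Return settings with CHAT_MODEL populated from OLLAMA_CHAT_MODEL when absent."""
    settings = dict(rag_settings)
    if not settings.get("CHAT_MODEL") and settings.get("OLLAMA_CHAT_MODEL"):
        settings["CHAT_MODEL"] = settings["OLLAMA_CHAT_MODEL"]
    return settings
```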
♻️ Duplicate comments (6)
archon-ui-main/test-results/.last-run.json (1)
1-6: Do not commit .last-run.json; ignore test-results in VCS.

This file is a transient Playwright/Vitest artifact and will churn frequently. Please:

- Add test-results/ (or at least test-results/.last-run.json) to archon-ui-main/.gitignore.
- Remove the tracked file via git rm --cached archon-ui-main/test-results/.last-run.json.

This matches the earlier review feedback.
INFRASTRUCTURE.md (1)
108-121: Remove hard-coded Supabase JWT from docs and use a placeholder instead.

The SUPABASE_SERVICE_KEY here is a real JWT with a long expiry. Even if intended for local use, committing it is a credential leak and has already been flagged by gitleaks and a prior review.

Recommend:

-# JWT Service Role Key (generiert mit lokalem JWT_SECRET)
-SUPABASE_SERVICE_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJyb2xlIjoic2VydmljZV9yb2xlIiwiaXNzIjoic3VwYWJhc2UiLCJpYXQiOjE3NjM2NTg1MDYsImV4cCI6MjA3OTAxODUwNn0.HVH5TgwW70JZtiGdnjU4RGexDDVbGnI3mXt-diQhVy8
+# JWT Service Role Key (mit lokalem JWT_SECRET generieren – nicht committen)
+SUPABASE_SERVICE_KEY=<generate-using-script-below>

…and explicitly instruct readers to run the Python script in this doc to generate their own key locally, add it to their .env, and never commit it.

python/src/server/services/embeddings/embedding_service.py (1)
447-473: Ollama embedding config still fragile and likely passing “[ENCRYPTED]” placeholders instead of real auth tokens

This block reintroduces both issues previously flagged in earlier reviews:

rag_settings = await _maybe_await(
    credential_service.get_credentials_by_category("rag_strategy")
)
...
ollama_base_url = rag_settings.get("OLLAMA_EMBEDDING_URL", "").rstrip("/v1").rstrip("/")
ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "")
ollama_api_mode = rag_settings.get("OLLAMA_API_MODE", "native")
...
adapter = _get_embedding_adapter(
    embedding_provider,
    client,
    base_url=ollama_base_url,
    auth_token=ollama_auth_token,
    api_mode=ollama_api_mode,
)

1. .rstrip on possibly non-string values: get_credentials_by_category("rag_strategy") returns objects like {"value": "[ENCRYPTED]", "is_encrypted": True, ...} for encrypted rows. If OLLAMA_EMBEDDING_URL or related keys are missing or malformed, you can end up calling .rstrip on None or on a dict, which will raise and force the generic “Failed to load embedding settings” path even though this is just misconfiguration. Guard and normalize the raw value before stripping.

2. Auth token is almost certainly the [ENCRYPTED] placeholder, not the decrypted secret: Because get_credentials_by_category intentionally returns masked values for encrypted entries, OLLAMA_EMBEDDING_AUTH_TOKEN here will be either a placeholder dict or the literal string "[ENCRYPTED]". That value is then used as the Bearer token in NativeOllamaEmbeddingAdapter, so auth will consistently fail even when a valid token is stored.

To address both, you can:

- Only compute Ollama-specific settings when embedding_provider.lower() == "ollama".
- Pull the real token via the backend-only decryption API instead of from the category map.
- Normalize the base URL defensively.
- rag_settings = await _maybe_await( - credential_service.get_credentials_by_category("rag_strategy") - ) - batch_size = int(rag_settings.get("EMBEDDING_BATCH_SIZE", "100")) - embedding_dimensions = int(rag_settings.get("EMBEDDING_DIMENSIONS", "1536")) - - # For Ollama, get native API URL, auth token, and API mode - ollama_base_url = rag_settings.get("OLLAMA_EMBEDDING_URL", "").rstrip("/v1").rstrip("/") - ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "") - ollama_api_mode = rag_settings.get("OLLAMA_API_MODE", "native") + rag_settings = await _maybe_await( + credential_service.get_credentials_by_category("rag_strategy") + ) + batch_size = int(rag_settings.get("EMBEDDING_BATCH_SIZE", "100")) + embedding_dimensions = int(rag_settings.get("EMBEDDING_DIMENSIONS", "1536")) + + ollama_base_url = "" + ollama_auth_token = "" + ollama_api_mode = "native" + + if (embedding_provider or "").lower() == "ollama": + raw_url = (rag_settings.get("OLLAMA_EMBEDDING_URL") or "").strip() + if isinstance(raw_url, str) and raw_url: + ollama_base_url = raw_url.rstrip("/v1").rstrip("/") + + # Use backend-only decrypted credential for the token + try: + ollama_auth_token = await credential_service.get_credential( + "OLLAMA_EMBEDDING_AUTH_TOKEN", + default="", + decrypt=True, + ) + except TypeError: + # Fallback if get_credential signature differs; keep empty token + pass + + mode_raw = rag_settings.get("OLLAMA_API_MODE") or "native" + if isinstance(mode_raw, str): + ollama_api_mode = mode_raw.strip().lower()and keep the default values in the
except block as you already do.

This avoids noisy errors from benign misconfig, and—more importantly—ensures the adapter receives a real decrypted Bearer token instead of masked placeholders.
python/src/server/api_routes/ollama_api.py (2)
99-131: [Duplicate] Normalize instance URLs consistently when mapping Ollama auth tokens

This issue has already been flagged in a previous review. The configured base URLs strip /v1 and then trailing /, but incoming instance URLs only strip trailing /. When callers pass URLs with /v1 (common from UI), the equality checks fail and no auth token is attached, causing 401 errors.

180-187: [Duplicate] Do not log full rag_settings – it may contain secrets

This issue has already been flagged in a previous review. Logging the entire rag_settings dictionary exposes sensitive data including Ollama auth tokens and other credentials. This is a security/compliance risk.

PLAN.md (1)
253-261: [Duplicate] Remove real auth tokens from documentation and rotate them

This critical issue has already been flagged in a previous review. The DB snapshot contains what appears to be a real Ollama auth token (ollama_13107e338aa16a6a8295592ce050f6cb) that should be immediately rotated and replaced with a placeholder.
🧹 Nitpick comments (12)
python/src/agent_work_orders/utils/state_reconciliation.py (1)
184-186: Catch specific exception types where possible.

Catching bare Exception during file operations masks specific errors. Consider catching OSError and its subclasses (e.g., PermissionError, FileNotFoundError) for more precise diagnostics, while allowing unexpected errors to propagate.

- except Exception as e:
+ except OSError as e:

This improves observability by distinguishing expected file-system errors from unexpected runtime failures, making it easier to diagnose issues during reconciliation.
archon-ui-main/package.json (1)
64-67: Playwright devDependency looks good; consider adding an e2e script.

Adding @playwright/test as a devDependency is appropriate for the new e2e tests. Optionally, add a "test:e2e": "playwright test" script so CI and devs have a clear entrypoint for those tests.

QUICKSTART.md (1)
7-19: Replace machine-specific absolute paths with repo-relative or placeholder paths.

The commands hard-code /Volumes/DATEN/Coding/archon/..., which is specific to one machine and will not work for most contributors. Consider using repo-relative paths or a placeholder like <ARCHON_REPO_ROOT> instead, for example:

- cd /Volumes/DATEN/Coding/archon/supabase
+ cd path/to/archon/supabase

- cd /Volumes/DATEN/Coding/archon
+ cd path/to/archon

This keeps the quick-start copy-pasteable for other environments.
Also applies to: 78-89, 125-128
INFRASTRUCTURE.md (1)
78-89: Generalize hard-coded local filesystem paths for broader reuse.

Multiple snippets use absolute paths like /Volumes/DATEN/Coding/archon/..., which are specific to one machine. To make this infra guide reusable, consider switching to repo-relative or placeholder paths (e.g., cd path/to/archon, cd path/to/archon/supabase) and clarifying that users should adapt them to their environment.

Also applies to: 125-128, 189-191, 345-347, 455-459, 465-472
python/src/server/services/llm_provider_service.py (1)
415-431: Ollama auth-token selection looks correct; consider DRYing it and tightening logging.The new logic to:
- Fetch
rag_strategysettings,- Select
OLLAMA_EMBEDDING_AUTH_TOKENvsOLLAMA_CHAT_AUTH_TOKEN, and- Fall back to
"required-but-ignored"when unset,is consistent between the OpenAI→Ollama fallback and the direct Ollama path, and avoids logging raw tokens (only lengths), which is good.
Two small follow-ups you might consider:
Factor out token resolution
Extract a helper like_get_ollama_auth_token(use_embedding: bool, instance_type: str | None)that encapsulates:
- Cache lookup / fetch of
rag_strategy,- Key selection (
EMBEDDINGvsCHAT),- Defaulting + warning.
That removes duplication between the fallback branch and the main
provider_name == "ollama"branch.Sanitize or downgrade embedding URL logging
logger.info(f"Embedding URL from settings: {embedding_url}")andlogger.info(f"Resolved embedding URL: {final_url}")will log whatever is configured, including potential credentials if someone encodes them in the URL. Using_sanitize_for_log(...)or moving these todebugwould reduce the chance of leaking sensitive connection details into logs.These are polish items; core behavior and error handling look solid.
Also applies to: 455-475, 618-622
python/src/server/services/embeddings/embedding_service.py (1)
107-172: NativeOllamaEmbeddingAdapter behavior is reasonable, but consider concurrency and dimensions semanticsThe adapter’s per-text
asyncio.gatherover the batch and strict validation of theembeddingfield look good. Two small points to keep in mind:
dimensionsis currently ignored; that’s fine given Ollama’s fixed-dimension embeddings, but it may be worth a brief comment so future readers don’t assume down‑projection happens here.- With default
EMBEDDING_BATCH_SIZEyou’ll fire up to that many concurrent/api/embeddingscalls per batch. If users push batch size much higher, you might want an internal concurrency cap or a per‑adapter semaphore to avoid overloading smaller Ollama instances.python/src/server/services/ollama/model_discovery_service.py (1)
122-135: Auth token support is good for/api/tags, but capability/detail calls still ignore itThe new
auth_tokenplumbing fordiscover_models,check_instance_health, anddiscover_models_from_multiple_instanceslooks correct: headers are built as:headers = {} if (auth_token): headers["Authorization"] = f"Bearer {auth_token}"and passed into
client.get(...)for/api/tags.Two follow‑ups to consider:
Capability and details endpoints still unauthenticated. Methods like
_test_embedding_capability_fast,_test_embedding_capability,_get_model_details, and the various chat/structured‑output tests (_test_chat_capability_fast,_test_chat_capability, etc.) still hit/api/embeddings,/api/show, and chat completions without any Authorization header. On an Ollama instance that enforces auth globally, model listing and health will now work, but enrichment and capability detection will likely return 401s and silently fall back to defaults.If you intend “protected instance” support end‑to‑end, you probably want to thread
auth_tokenthrough_enrich_model_capabilitiesand the downstream helpers (or at least_get_model_detailsand both embedding tests) and attach the same header there.Health cache is keyed only by URL.
check_instance_healthcaches byhealth_{instance_url}regardless of token. If a user fixes a bad token and immediately re‑checks health for the same URL, they can get a stale cached error for up to 30s. Not critical, but you might want to either skip caching when a request fails with 401, or include the token (or a hash of it) in the cache key.Both are incremental and keep backward compatibility, but they’ll make the new auth support behave more predictably on locked‑down Ollama deployments.
Also applies to: 192-199, 235-243, 964-999, 1021-1024, 1026-1060
archon-ui-main/test-results/tests-e2e-ollama-api-mode--98290-t-to-Native-Ollama-API-mode/error-context.md (1)
1-638: Large UI snapshot is fine, but confirm it’s meant to live in version controlThis file is purely a captured UI tree for an e2e failure context and doesn’t affect runtime behavior. If it’s generated by tests, consider whether it should be checked in or ignored via
.gitignoreto avoid churn; otherwise, it’s fine as documentation of the failing state.python/src/server/api_routes/ollama_api.py (1)
204-215: Consider reducing log verbosity for token selection.The health check logs token selection decisions at
infolevel for every instance check. This creates verbose logs in production. Consider moving lines 210, 213, and 215 todebuglevel to reduce noise while keeping the information available for troubleshooting.Apply this diff to reduce log verbosity:
if url == llm_base_url and chat_auth_token: auth_token = chat_auth_token - logger.info(f"Using chat auth token for {url}") + logger.debug(f"Using chat auth token for {url}") elif url == embedding_base_url and embedding_auth_token: auth_token = embedding_auth_token - logger.info(f"Using embedding auth token for {url}") + logger.debug(f"Using embedding auth token for {url}") else: - logger.warning(f"No matching auth token found for {url}") + logger.debug(f"No auth token configured for {url}")archon-ui-main/tests/e2e/ollama-api-mode.spec.ts (1)
1-192: Replace hardcoded timeouts with Playwright's auto-waiting mechanisms.The tests use
page.waitForTimeout()extensively (30+ occurrences). Playwright's auto-waiting is more reliable and faster. Most timeouts can be removed since Playwright automatically waits for elements before interacting.Example refactor for lines 9-26:
test('should display API mode radio buttons when Ollama is selected', async ({ page }) => { // Navigate to RAG Settings tab await page.click('text=RAG Settings'); - await page.waitForTimeout(500); // Select Ollama as embedding provider await page.click('text=Embedding'); - await page.waitForTimeout(300); // Click on Ollama provider card const ollamaCard = page.locator('button:has-text("Ollama")').first(); await ollamaCard.click(); - await page.waitForTimeout(500); // Open Ollama configuration const configButton = page.locator('button:has-text("Config")').first(); await configButton.click(); - await page.waitForTimeout(500); // Verify API mode section is visible (auto-waits) await expect(page.locator('text=Ollama API Mode')).toBeVisible();Playwright's
click()andexpect().toBeVisible()already include auto-waiting, making explicit timeouts unnecessary.archon-ui-main/src/components/settings/RAGSettings.tsx (2)
1595-1652: Consider clarifying that API mode is a global setting.The Ollama API Mode selection appears within the context-aware configuration panel (LLM or Embedding), but it's actually a global setting that affects both. Users might expect it to be per-instance.
Consider adding a helper text indicating this applies to all Ollama instances:
<div className="mb-4 pb-4 border-b border-green-500/20"> <label className="block text-sm font-medium text-gray-300 mb-3"> Ollama API Mode </label> + <p className="text-xs text-gray-400 mb-2"> + This setting applies to all Ollama instances (both LLM and Embedding) + </p> <div className="flex gap-3">
2363-2387: Extract authentication UI into a reusable component.The authentication sections in both LLM and Embedding modals are nearly identical (lines 2363-2387 and 2454-2477), violating the DRY principle.
Consider creating a reusable component:
interface AuthenticationSectionProps { id: string; useAuth: boolean; authToken: string; onUseAuthChange: (checked: boolean) => void; onTokenChange: (token: string) => void; accentColor?: 'green' | 'purple'; } const AuthenticationSection: React.FC<AuthenticationSectionProps> = ({ id, useAuth, authToken, onUseAuthChange, onTokenChange, accentColor = 'green' }) => ( <div className="space-y-2 mt-4"> <div className="flex items-center gap-2"> <input type="checkbox" id={id} checked={useAuth} onChange={(e) => onUseAuthChange(e.target.checked)} className={`w-4 h-4 text-${accentColor}-600 ...`} /> <label htmlFor={id} className="text-sm font-medium text-gray-700 dark:text-gray-300"> Use Authentication </label> </div> {useAuth && ( <Input type="password" placeholder="Auth Token" value={authToken} onChange={(e) => onTokenChange(e.target.value)} className="text-sm" /> )} </div> );Then use it in both modals to reduce duplication.
Also applies to: 2454-2477
interface RAGSettingsProps {
  ragSettings: {
    MODEL_CHOICE: string;
    CHAT_MODEL?: string;
Add missing auth token fields to the RAGSettingsProps interface.
The code accesses OLLAMA_CHAT_AUTH_TOKEN and OLLAMA_EMBEDDING_AUTH_TOKEN via type assertion (ragSettings as any), which bypasses TypeScript's type checking and violates the strict mode requirement.
Apply this diff to add the missing fields to the interface:
interface RAGSettingsProps {
ragSettings: {
MODEL_CHOICE: string;
CHAT_MODEL?: string;
USE_CONTEXTUAL_EMBEDDINGS: boolean;
CONTEXTUAL_EMBEDDINGS_MAX_WORKERS: number;
USE_HYBRID_SEARCH: boolean;
USE_AGENTIC_RAG: boolean;
USE_RERANKING: boolean;
LLM_PROVIDER?: string;
LLM_BASE_URL?: string;
LLM_INSTANCE_NAME?: string;
EMBEDDING_MODEL?: string;
EMBEDDING_PROVIDER?: string;
OLLAMA_EMBEDDING_URL?: string;
OLLAMA_EMBEDDING_INSTANCE_NAME?: string;
+ OLLAMA_CHAT_AUTH_TOKEN?: string;
+ OLLAMA_EMBEDDING_AUTH_TOKEN?: string;
+ OLLAMA_API_MODE?: 'native' | 'openai-compatible';
// Crawling Performance Settings
CRAWL_BATCH_SIZE?: number;
// ... rest of fields
};
setRagSettings: (settings: any) => void;
}

Then remove the type assertions on lines 230, 248, 253, and 271:
-const newAuthToken = (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN || '';
+const newAuthToken = ragSettings.OLLAMA_CHAT_AUTH_TOKEN || '';

As per coding guidelines.
Also applies to: 230-230, 248-248, 253-253, 271-271
🤖 Prompt for AI Agents
In archon-ui-main/src/components/settings/RAGSettings.tsx around line 144, the
RAGSettingsProps interface is missing OLLAMA_CHAT_AUTH_TOKEN and
OLLAMA_EMBEDDING_AUTH_TOKEN fields; add them (both optional strings) to the
interface definition, then remove the unsafe (ragSettings as any) type
assertions at lines 230, 248, 253 and 271 and instead access those newly typed
properties directly from ragSettings so TypeScript strict mode is satisfied.
useEffect(() => {
  if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
    loadAvailableModels(llmInstanceConfig.url, 'chat');
  }
}, [llmInstanceConfig.url]);

useEffect(() => {
  if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
    loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
  }
}, [embeddingInstanceConfig.url]);
🛠️ Refactor suggestion | 🟠 Major
Remove hardcoded URL check to enable model loading for default instances.
The auto-load effects skip model loading for the default Ollama URL (http://host.docker.internal:11434/v1). Users with a working default instance won't see their available models without manually editing the URL.
Apply this diff:
useEffect(() => {
- if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
+ if (llmInstanceConfig.url && llmInstanceConfig.url.trim()) {
loadAvailableModels(llmInstanceConfig.url, 'chat');
}
}, [llmInstanceConfig.url]);
useEffect(() => {
- if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
+ if (embeddingInstanceConfig.url && embeddingInstanceConfig.url.trim()) {
loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
}
}, [embeddingInstanceConfig.url]);

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
useEffect(() => {
  if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
    loadAvailableModels(llmInstanceConfig.url, 'chat');
  }
}, [llmInstanceConfig.url]);

useEffect(() => {
  if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
    loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
  }
}, [embeddingInstanceConfig.url]);

useEffect(() => {
  if (llmInstanceConfig.url && llmInstanceConfig.url.trim()) {
    loadAvailableModels(llmInstanceConfig.url, 'chat');
  }
}, [llmInstanceConfig.url]);

useEffect(() => {
  if (embeddingInstanceConfig.url && embeddingInstanceConfig.url.trim()) {
    loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
  }
}, [embeddingInstanceConfig.url]);
🤖 Prompt for AI Agents
In archon-ui-main/src/components/settings/RAGSettings.tsx around lines 681 to
691, the useEffect hooks skip loading models when the URL equals the hardcoded
default Ollama URL; remove that hardcoded comparison so model loading runs for
any truthy URL. Update both effects to call loadAvailableModels(...) whenever
llmInstanceConfig.url or embeddingInstanceConfig.url is set (truthy), leaving
the dependency arrays unchanged. Ensure no other behavior is altered and retain
the 'chat' and 'embedding' type arguments.
const updatedSettings = {
  ...ragSettings,
  LLM_BASE_URL: llmInstanceConfig.url,
  OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : ''
};
setRagSettings(updatedSettings);
🛠️ Refactor suggestion | 🟠 Major
Validate authentication tokens before persisting.
When useAuth is enabled but authToken is empty, the code still persists an empty token to ragSettings. This could lead to failed authentication attempts without clear feedback.
Add validation before saving:
<Button
onClick={async () => {
+ if (llmInstanceConfig.useAuth && !llmInstanceConfig.authToken.trim()) {
+ showToast('Authentication token is required when authentication is enabled', 'error');
+ return;
+ }
const updatedSettings = {
...ragSettings,
LLM_BASE_URL: llmInstanceConfig.url,
OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : ''
};
Apply similar validation to the Embedding instance modal (lines 2490-2495).
Also applies to: 2490-2495
| logger.info("orphaned_worktree_deleted", path=orphan_path) | ||
| else: | ||
| # Safety check failed - do not delete | ||
| actions.append(f"Skipped deletion of {orphan_path} (safety check failed: outside worktree base or invalid path)") |
🧩 Analysis chain
Verify line length compliance (target: ≤120 characters for Python backend code).
Line 174 appears to exceed the 120-character limit. Please count and refactor if needed by:
- Extracting the long message to a variable
- Breaking the line across multiple statements
If lines exceed 120 characters, consider extracting message strings to variables:
- actions.append(f"Skipped deletion of {orphan_path} (safety check failed: outside worktree base or invalid path)")
+ skip_msg = f"Skipped deletion of {orphan_path} (safety check failed: outside worktree base or invalid path)"
+ actions.append(skip_msg)
🏁 Script executed:
#!/bin/bash
# Check line lengths in the reconcile_state function changes
# Get exact character count for lines around 174 and 176-183
sed -n '174p; 176,183p' python/src/agent_work_orders/utils/state_reconciliation.py | while IFS= read -r line; do
echo "$(echo "$line" | wc -c): $line"
done
Length of output: 581
Refactor line 174 to comply with 120-character line length limit.
Line 174 is 134 characters, exceeding the backend code standard by 14 characters. Extract the message to a variable:
skip_msg = f"Skipped deletion of {orphan_path} (safety check failed: outside worktree base or invalid path)"
actions.append(skip_msg)
🤖 Prompt for AI Agents
In python/src/agent_work_orders/utils/state_reconciliation.py around line 174,
the f-string appended to actions exceeds the 120-character limit; extract the
message into a local variable (e.g., skip_msg) holding the formatted string and
then append that variable to actions to keep the line lengths within limit.
def _get_embedding_adapter(
    provider: str, client: Any, base_url: str | None = None, auth_token: str | None = None, api_mode: str | None = None
) -> EmbeddingProviderAdapter:
    provider_name = (provider or "").lower()
    if provider_name == "google":
        search_logger.info("Using Google native embedding adapter")
        return GoogleEmbeddingAdapter()
    if provider_name == "ollama" and base_url:
        # Check API mode - default to native if not specified or if set to native
        if (api_mode or "native") == "native":
            search_logger.info(f"Using Ollama native API adapter with base URL: {base_url}")
            return NativeOllamaEmbeddingAdapter(base_url, auth_token)
        else:
            search_logger.info(f"Using OpenAI-compatible adapter for Ollama with base URL: {getattr(client, 'base_url', 'N/A')}")
            return OpenAICompatibleEmbeddingAdapter(client)
    search_logger.info(f"Using OpenAI-compatible embedding adapter for provider: {provider_name}")
    return OpenAICompatibleEmbeddingAdapter(client)
Make api_mode handling case-insensitive in _get_embedding_adapter
Right now the Ollama branch does:
if (api_mode or "native") == "native":
    ...
If a user stores "Native" or "NATIVE" in settings, this will silently fall through to the OpenAI-compatible path. Normalizing once avoids brittle config:
-def _get_embedding_adapter(
- provider: str, client: Any, base_url: str | None = None, auth_token: str | None = None, api_mode: str | None = None
-) -> EmbeddingProviderAdapter:
+def _get_embedding_adapter(
+ provider: str,
+ client: Any,
+ base_url: str | None = None,
+ auth_token: str | None = None,
+ api_mode: str | None = None,
+) -> EmbeddingProviderAdapter:
@@
- if provider_name == "ollama" and base_url:
- # Check API mode - default to native if not specified or if set to native
- if (api_mode or "native") == "native":
+ if provider_name == "ollama" and base_url:
+ # Check API mode - default to native if not specified or if set to native (case-insensitive)
+ mode = (api_mode or "native").strip().lower()
+ if mode == "native":This keeps behavior predictable for users editing OLLAMA_API_MODE manually.
🤖 Prompt for AI Agents
In python/src/server/services/embeddings/embedding_service.py around lines 287
to 303, the check for Ollama's api_mode is case-sensitive so values like
"Native" or "NATIVE" will not match; normalize api_mode to lowercase before
comparing (e.g., use (api_mode or "native").lower() == "native") so the branch
selection is case-insensitive, and keep the logging messages unchanged.
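For context on the native vs OpenAI-compatible split handled by _get_embedding_adapter: the two adapters ultimately hit different routes with different payload keys on the same Ollama host. A rough sketch, assuming a stock Ollama server and the field names documented for its /api/embeddings and /v1/embeddings endpoints (these helpers are illustrative, not code from this PR):

import httpx

async def embed_native(base_url: str, model: str, text: str, token: str | None = None) -> list[float]:
    # Native Ollama API: POST {base}/api/embeddings with a "prompt" field
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            f"{base_url.rstrip('/')}/api/embeddings",
            json={"model": model, "prompt": text},
            headers=headers,
        )
        resp.raise_for_status()
        return resp.json()["embedding"]

async def embed_openai_compatible(base_url: str, model: str, text: str, token: str | None = None) -> list[float]:
    # OpenAI-compatible API: POST {base}/v1/embeddings with an "input" field
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.post(
            f"{base_url.rstrip('/')}/v1/embeddings",
            json={"model": model, "input": text},
            headers=headers,
        )
        resp.raise_for_status()
        return resp.json()["data"][0]["embedding"]

Either path can carry the per-instance Bearer token; only the route and the payload/response shapes differ.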
Security fixes: - Remove hardcoded JWT token from INFRASTRUCTURE.md (replace with placeholder) - Remove real Ollama auth tokens from PLAN.md (replace with placeholders) - Fix secrets leak: Change RAG settings logging from info to debug level, log only keys Functional fixes: - Fix URL normalization for auth-token mapping (both incoming and configured URLs now use same normalization) Best practices: - Add test-results/ to .gitignore - Remove test-results from version control All 22 Python tests pass, server healthy. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
f668f36 to 80ca26d (Compare)
@coderabbitai review
✅ Actions performed: Review triggered.
PR coleam00#875: Ollama Support with Local Model Discovery - Security fixes: JWT tokens, auth tokens, RAG settings logging - Bug fixes: URL normalization, None-handling - Best practices: test-results gitignore All 7 items marked as ERLEDIGT (2025-11-21) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
This reverts commit d7b5ea6.
Actionable comments posted: 0
🧹 Nitpick comments (2)
PRPs/todo.md (2)
1-107: Clarify document purpose and placement. This appears to be a retrospective CodeRabbit review tracking document. Please clarify: is this intended to be a committed artifact in the repository, or should it be maintained elsewhere (e.g., PR description, internal wiki)? If it should remain in the repo, add a brief header explaining its purpose and intended audience.
99-99: Use a heading instead of emphasis. Line 99 uses bold text (**Gesamt: 7 actionable Items**) where a markdown heading would be more appropriate for document structure.
- **Gesamt: 7 actionable Items**
+ ## Gesamt: 7 actionable Items
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
PRPs/todo.md (1 hunks)
🧰 Additional context used
🪛 Gitleaks (8.29.0)
PRPs/todo.md
[high] 21-21: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
[high] 22-22: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 LanguageTool
PRPs/todo.md
Numerous German grammar, spelling, and punctuation suggestions (QB_NEW_DE rule family) flagged across lines 1–107 of the file.
🪛 markdownlint-cli2 (0.18.1)
PRPs/todo.md
99-99: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🔇 Additional comments (1)
PRPs/todo.md (1)
21-22: No security risk—this document is tracking completed remediation work, not storing active credentials. The token values shown on lines 21–22 are historical documentation of a security issue that has already been fixed. Verification confirms:
- The token does not appear anywhere else in the repository
- PLAN.md (the file where tokens were originally exposed) has already been sanitized with
ollama_xxx_placeholder_token placeholders
- Git history shows the fix was completed: "fix(security): Address CodeRabbit review findings"
- PRPs/todo.md is a German-language retrospective tracking document describing problems found and their solutions
This file documents that the issue was remediated server-side (as noted: "Server-seitig Tokens rotieren/revoken!"), not that secrets are currently exposed.
Likely an incorrect or invalid review comment.
Sorry, I forgot to clean the todo.md. This was just for the implementation of this ollama-fix/-overhaul.
- Add auth_token parameter to validate_provider_instance() function - Load auth tokens from RAG settings in /validate endpoint - Match incoming URLs against configured LLM_BASE_URL and OLLAMA_EMBEDDING_URL - Apply correct auth token (chat or embedding) based on URL matching - Add tests for auth token flow through all layers 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
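The URL matching described in this commit only works if the incoming URL and the configured URL are normalized the same way before comparison. A minimal sketch of the idea, using illustrative helper names rather than the exact code in ollama_api.py:

def normalize_ollama_url(url: str | None) -> str:
    # Strip a trailing /v1 suffix and trailing slashes so equivalent URLs compare equal
    return (url or "").replace("/v1", "").rstrip("/")

def resolve_auth_token(instance_url: str, rag_settings: dict[str, str]) -> str | None:
    # Pick the chat or embedding token whose configured URL matches the incoming instance URL
    incoming = normalize_ollama_url(instance_url)
    candidates = [
        (rag_settings.get("LLM_BASE_URL"), rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN")),
        (rag_settings.get("OLLAMA_EMBEDDING_URL"), rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN")),
    ]
    for configured_url, token in candidates:
        if token and normalize_ollama_url(configured_url) == incoming:
            return token
    return None

# Example: both spellings of the same host resolve to the chat token
settings = {
    "LLM_BASE_URL": "http://host.docker.internal:11434/v1",
    "OLLAMA_CHAT_AUTH_TOKEN": "ollama_example_placeholder_token",
}
assert resolve_auth_token("http://host.docker.internal:11434", settings) == "ollama_example_placeholder_token"

With both sides normalized, http://host:11434 and http://host:11434/v1 map to the same token, which is exactly the mismatch behind the earlier 401 errors.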
When "Use same host for embedding instance" is checked, now also copies useAuth and authToken from LLM config to embedding config, not just the URL and name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- Extract syncEmbeddingFromLLM() into testable utility function - Add 6 tests covering: full sync, default name, custom default, auth settings copy, empty token handling, immutability - Refactor RAGSettings to use the utility function 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
python/src/server/api_routes/ollama_api.py (3)
254-256: Bug: timestamp returns module name instead of actual timestamp.
model_discovery_service.check_instance_health.__module__ returns the module path string (e.g., "src.server.services.ollama.model_discovery_service"), not a timestamp.
- "timestamp": model_discovery_service.check_instance_health.__module__  # Use current timestamp
+ "timestamp": datetime.now().isoformat()
510-517: Missing auth token support indiscover_and_store_models_endpoint.Unlike
discover_models_endpointandvalidate_instance_endpoint, this endpoint doesn't extract auth tokens from RAG settings. Calls todiscover_models()at line 517 will fail on protected Ollama instances.Consider adding auth token extraction similar to other endpoints:
# Get auth tokens from RAG settings from ..services.credential_service import credential_service rag_settings = await credential_service.get_credentials_by_category("rag_strategy") llm_base_url = (rag_settings.get("LLM_BASE_URL") or "").replace("/v1", "").rstrip("/") embedding_base_url = (rag_settings.get("OLLAMA_EMBEDDING_URL") or "").replace("/v1", "").rstrip("/") chat_auth_token = rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN") or "" embedding_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN") or "" # Then pass auth_token to discover_models based on URL matching
1056-1065: Missing auth token support indiscover_models_with_real_details.The direct
httpx.AsyncClientrequest to/api/tagsdoesn't include Authorization headers. This will fail on protected Ollama instances.+ # Get auth token for this instance + normalized_url = base_url.replace("/v1", "").rstrip("/") + auth_token = None + if normalized_url == llm_base_url and chat_auth_token: + auth_token = chat_auth_token + elif normalized_url == embedding_base_url and embedding_auth_token: + auth_token = embedding_auth_token + + headers = {} + if auth_token: + headers["Authorization"] = f"Bearer {auth_token}" + async with httpx.AsyncClient(timeout=httpx.Timeout(5.0)) as client: - tags_response = await client.get(f"{base_url}/api/tags") + tags_response = await client.get(f"{base_url}/api/tags", headers=headers)
♻️ Duplicate comments (4)
python/tests/test_ollama_auth_token.py (1)
170-180: Same patch target issue as above. This test has the same patching concern.
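For reference, a minimal sketch of the patching pattern this refers to: patch httpx.AsyncClient where the service uses it, not where it is defined (the module path below is the one named in the review; adjust it if the service imports httpx differently):

from unittest.mock import AsyncMock, MagicMock, patch

def make_mock_async_client(payload: dict) -> MagicMock:
    # Async-context-manager mock whose client.get(...) returns a canned JSON payload
    response = MagicMock(status_code=200)
    response.json.return_value = payload
    client = MagicMock()
    client.get = AsyncMock(return_value=response)
    client.__aenter__.return_value = client
    client.__aexit__.return_value = False
    return client

# Hypothetical usage inside an async test:
# with patch(
#     "src.server.services.ollama.model_discovery_service.httpx.AsyncClient",
#     return_value=make_mock_async_client({"models": []}),
# ):
#     models = await model_discovery_service.discover_models("http://localhost:11434")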
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
142-176: Add missing Ollama auth/API fields toRAGSettingsPropsand removeas anycasts
ragSettingsis accessed with properties that are not declared on its type:
OLLAMA_CHAT_AUTH_TOKEN(lines 231, 233–235, 243, 249)OLLAMA_EMBEDDING_AUTH_TOKEN(lines 254, 256–258, 266, 272)OLLAMA_API_MODE(lines 1605–1607, 1613–1618, 1631–1633, 1638–1643)This forces usage of
(ragSettings as any)and violates the strict-typing guideline. It also makes it easy to accidentally break these flows without a type error.Define the fields on
ragSettingsand then remove theas anycasts and expression dependencies:interface RAGSettingsProps { ragSettings: { MODEL_CHOICE: string; CHAT_MODEL?: string; + OLLAMA_CHAT_AUTH_TOKEN?: string; + OLLAMA_EMBEDDING_AUTH_TOKEN?: string; + OLLAMA_API_MODE?: 'native' | 'openai-compatible'; USE_CONTEXTUAL_EMBEDDINGS: boolean; @@ - const lastLLMConfigRef = useRef({ url: '', name: '', authToken: '' }); - const lastEmbeddingConfigRef = useRef({ url: '', name: '', authToken: '' }); + const lastLLMConfigRef = useRef({ url: '', name: '', authToken: '' }); + const lastEmbeddingConfigRef = useRef({ url: '', name: '', authToken: '' }); @@ - const newAuthToken = (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_CHAT_AUTH_TOKEN || ''; @@ - }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN]); + }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, ragSettings.OLLAMA_CHAT_AUTH_TOKEN]); @@ - const newAuthToken = (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN || ''; @@ - }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN]); + }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN]);This keeps the Ollama auth/API-mode surface properly typed and removes the need for
any.Also applies to: 225-227, 231-235, 243-244, 249-249, 254-258, 266-272, 1596-1653
682-692: Remove hardcoded default URL guard so models load for the default instanceThe auto-load effects currently skip model loading for the default Ollama URL:
if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { loadAvailableModels(llmInstanceConfig.url, 'chat'); }Users running the default instance will not see available models auto-populated until they change the URL, which is confusing.
You can safely load models for any non-empty URL:
- useEffect(() => { - if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(llmInstanceConfig.url, 'chat'); - } - }, [llmInstanceConfig.url]); + useEffect(() => { + if (llmInstanceConfig.url?.trim()) { + void loadAvailableModels(llmInstanceConfig.url, 'chat'); + } + }, [llmInstanceConfig.url]); @@ - useEffect(() => { - if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); - } - }, [embeddingInstanceConfig.url]); + useEffect(() => { + if (embeddingInstanceConfig.url?.trim()) { + void loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); + } + }, [embeddingInstanceConfig.url]);This aligns with earlier feedback and improves the out-of-the-box experience.
2360-2383: Validate auth tokens before saving whenuseAuthis enabledRight now, when
useAuthis checked but the token is empty or whitespace, the Save handlers still persist an empty token:
- LLM modal Save (lines 2395–2401)
- Embedding modal Save (lines 2485–2490)
This leads to a configuration that looks authenticated in the UI but always fails at runtime, without immediate feedback.
Add a simple validation gate before building
updatedSettingsin both handlers, e.g.:// LLM modal Save const Button onClick={async () => { - const updatedSettings = { - ...ragSettings, - LLM_BASE_URL: llmInstanceConfig.url, - OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : '' - }; + if (llmInstanceConfig.useAuth && !llmInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } + + const updatedSettings = { + ...ragSettings, + LLM_BASE_URL: llmInstanceConfig.url, + OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken.trim() : '' + }; @@ // Embedding modal Save - const updatedSettings = { - ...ragSettings, - OLLAMA_EMBEDDING_URL: embeddingInstanceConfig.url, - OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken : '' - }; + if (embeddingInstanceConfig.useAuth && !embeddingInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } + + const updatedSettings = { + ...ragSettings, + OLLAMA_EMBEDDING_URL: embeddingInstanceConfig.url, + OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken.trim() : '' + };This prevents “silent failure” configs and gives users actionable feedback.
Also applies to: 2395-2402, 2450-2473, 2485-2491
🧹 Nitpick comments (3)
python/tests/test_ollama_auth_token.py (2)
30-40: Unused fixture:mock_credential_serviceThis fixture is defined but never used in any of the tests in this file. Either remove it or add tests that utilize credential service mocking for more comprehensive coverage.
- @pytest.fixture - def mock_credential_service(self): - """Mock credential service with auth token settings.""" - mock_service = MagicMock() - mock_service.get_credentials_by_category = AsyncMock(return_value={ - "LLM_BASE_URL": "http://localhost:11434", - "OLLAMA_EMBEDDING_URL": "http://localhost:11434", - "OLLAMA_CHAT_AUTH_TOKEN": "test-chat-token", - "OLLAMA_EMBEDDING_AUTH_TOKEN": "test-embedding-token", - }) - return mock_service -
62-64: Consider a clearer assertion forauth_tokenverification.The
orlogic works but makes debugging harder when tests fail. Consider checking which argument style was actually used:- # Check that auth_token was passed as keyword argument - assert call_args.kwargs.get("auth_token") == "test-auth-token" or \ - (len(call_args.args) > 1 and call_args.args[1] == "test-auth-token") + # Check that auth_token was passed (either as kwarg or positional) + auth_token_in_kwargs = call_args.kwargs.get("auth_token") == "test-auth-token" + auth_token_in_args = len(call_args.args) > 1 and call_args.args[1] == "test-auth-token" + assert auth_token_in_kwargs or auth_token_in_args, ( + f"auth_token not found in call. kwargs={call_args.kwargs}, args={call_args.args}" + )archon-ui-main/src/components/settings/RAGSettings.tsx (1)
212-221: Optionally DRY up the default Ollama URL usageThe default Ollama URL literal
'http://host.docker.internal:11434/v1'appears in several places despite the presence ofDEFAULT_OLLAMA_URL:
- Initial
llmInstanceConfigandembeddingInstanceConfig(lines 212, 218)- Auto-load effects (to be refactored per earlier comment)
- Page-load testing fallbacks (lines 1253–1263, 1285–1293)
To keep things consistent and easier to change later, you could centralize on
DEFAULT_OLLAMA_URL:const [llmInstanceConfig, setLLMInstanceConfig] = useState({ name: '', - url: ragSettings.LLM_BASE_URL || 'http://host.docker.internal:11434/v1', + url: ragSettings.LLM_BASE_URL || DEFAULT_OLLAMA_URL, useAuth: false, authToken: '' }); @@ const [embeddingInstanceConfig, setEmbeddingInstanceConfig] = useState({ name: '', - url: ragSettings.OLLAMA_EMBEDDING_URL || 'http://host.docker.internal:11434/v1', + url: ragSettings.OLLAMA_EMBEDDING_URL || DEFAULT_OLLAMA_URL, useAuth: false, authToken: '' });Not critical, but it avoids subtle drift if the default host ever needs to change.
Also applies to: 682-692, 1253-1263, 1285-1293
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
archon-ui-main/src/components/settings/RAGSettings.tsx (17 hunks)
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts (1 hunks)
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1 hunks)
python/src/server/api_routes/ollama_api.py (4 hunks)
python/src/server/services/llm_provider_service.py (5 hunks)
python/tests/test_ollama_auth_token.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (11)
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/**/*.{ts,tsx}: Use TypeScript in strict mode with no implicit any in frontend code
Do not use dynamic Tailwind class construction in frontend components
Files:
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/!(features)/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use ESLint with standard React rules for legacy code in the frontend
Files:
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use ESLint for legacy frontend code outside /src/features directory with standard React rules
Files:
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Fail fast and loud for service startup failures, missing configuration, database connection failures, authentication/authorization failures, data corruption, critical dependency unavailability, and invalid data that would corrupt state
Never accept corrupted data - skip failed items entirely rather than storing corrupted or incomplete data
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/services/llm_provider_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Complete batch processing, background tasks, and external API calls but log detailed errors for each failure instead of crashing
Include context about what was being attempted, preserve full stack traces with exc_info=True in Python logging, use specific exception types, include relevant IDs/URLs, never return None to indicate failure, and report both success count and detailed failure list for batch operations
Use Ruff for linting and MyPy for type checking with 120 character line length in Python backend
Use Python 3.12 with 120 character line length in backend code
Import GITHUB_REPO_OWNER and GITHUB_REPO_NAME constants from python/src/server/config/version.py as the single source of truth for repository configuration
python/src/**/*.py: Python backend code should use 120 character line length
Use Ruff for linting Python code to check for errors, warnings, and unused imports
Use Mypy for type checking in Python to ensure type safety
Files:
python/src/server/api_routes/ollama_api.py
python/src/server/services/llm_provider_service.py
python/src/server/api_routes/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/api_routes/**/*.py: Follow the API route → Service → Database pattern for new API endpoints
Use database values directly without frontend mapping to maintain type-safe end-to-end communication from backend upward
Files:
python/src/server/api_routes/ollama_api.py
python/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Error messages must include context about what was being attempted, preserve full stack traces, use specific exception types, include relevant IDs/URLs, and report both success count and detailed failure list for batch operations
Files:
python/src/server/api_routes/ollama_api.py
python/tests/test_ollama_auth_token.py
python/src/server/services/llm_provider_service.py
archon-ui-main/src/**/*.{test,spec}.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use Vitest with React Testing Library for frontend testing
Files:
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
archon-ui-main/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use Vitest with React Testing Library for frontend testing
Files:
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
python/tests/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Use Pytest for testing with async support in Python backend
Use Pytest for testing Python code with async support
Files:
python/tests/test_ollama_auth_token.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
python/src/server/services/**/*.py: For batch processing and background tasks, complete what you can and report detailed failures for each item rather than crashing
Never accept corrupted data - skip failed items entirely rather than storing corrupted data in batch operations
Files:
python/src/server/services/llm_provider_service.py
🧠 Learnings (7)
📓 Common learnings
Learnt from: zaksnet
Repo: coleam00/Archon PR: 473
File: python/src/providers_clean/api/routes/post_migrate_model_config.py:7-9
Timestamp: 2025-09-10T06:45:10.853Z
Learning: User zaksnet prefers to handle authentication/authorization in separate PRs rather than mixing security infrastructure with feature development.
📚 Learning: 2025-11-29T19:25:01.813Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: python/src/agent_work_orders/CLAUDE.md:0-0
Timestamp: 2025-11-29T19:25:01.813Z
Learning: Applies to python/src/agent_work_orders/**/*.py : Do not log sensitive data such as passwords, API keys, or tokens; mask them using format like `api_key[:8] + "..."`
Applied to files:
python/src/server/api_routes/ollama_api.py
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/**/*.test.{ts,tsx} : Use Vitest with React Testing Library for frontend testing
Applied to files:
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/**/*.{test,spec}.{ts,tsx} : Use Vitest with React Testing Library for frontend testing
Applied to files:
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/**/*.{ts,tsx} : Use TypeScript in strict mode with no implicit any in frontend code
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : TypeScript strict mode enabled with no implicit any, and use Biome with 120 character lines and double quotes in /src/features directory
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-09-25T16:36:48.527Z
Learnt from: Chillbruhhh
Repo: coleam00/Archon PR: 736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.527Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (4)
python/src/server/api_routes/ollama_api.py (2)
python/src/server/services/credential_service.py (1)
get_credentials_by_category (323-369)
python/src/server/services/ollama/model_discovery_service.py (3)
discover_models_from_multiple_instances (1026-1140)
check_instance_health (964-1024)
discover_models (122-243)
archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts (1)
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (2)
InstanceConfig (5-10)
syncEmbeddingFromLLM (18-28)
python/src/server/services/llm_provider_service.py (3)
python/src/server/services/credential_service.py (1)
get_credentials_by_category (323-369)
python/src/server/api_routes/settings_api.py (1)
get_credentials_by_category (78-93)
python/src/server/services/ollama/model_discovery_service.py (1)
check_instance_health(964-1024)
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1)
syncEmbeddingFromLLM (18-28)
archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input(8-29)
🔇 Additional comments (14)
python/src/server/api_routes/ollama_api.py (3)
99-143: Auth token propagation looks correct.The token extraction from RAG settings, URL normalization, and mapping to
auth_tokens_mapis implemented correctly. The normalized URL comparison ensures consistent token matching.
181-218: Health check auth token handling is correct.The implementation properly extracts tokens from RAG settings, normalizes URLs for consistent matching, and passes the appropriate token to
check_instance_health(). Debug logging only outputs keys/URLs, not secret values.
277-308: Validation endpoint correctly propagates auth tokens.The auth token is properly extracted, matched to the appropriate URL, and passed through both
validate_provider_instance()anddiscover_models()calls.python/src/server/services/llm_provider_service.py (4)
415-441: Ollama fallback auth token handling looks correct.The token selection based on
use_embedding_provideris appropriate, and the "required-but-ignored" fallback handles Ollama instances without auth enabled. Token values are not logged (only length would be logged in the main Ollama path).
455-480: Ollama auth token handling is well implemented.Token selection based on operation type (
use_embedding_providerorinstance_type == "embedding") is correct. Logging token length (not the token itself) is appropriate for debugging while maintaining security. Based on learnings, this follows the guideline to "mask [tokens] using format likeapi_key[:8] + \"...\"" — logging only length is similarly safe.
616-622: Good addition of debugging logs for URL resolution.The logging at lines 618 and 621-622 will help diagnose embedding URL configuration issues without exposing sensitive data.
1211-1235:validate_provider_instancecorrectly extended withauth_tokensupport.The function signature, docstring, and implementation properly handle the optional
auth_tokenparameter and pass it through tocheck_instance_health().python/tests/test_ollama_auth_token.py (1)
133-144: Patch target may not intercept the actualhttpx.AsyncClientusage.The patch should target where
httpx.AsyncClientis used (in the service module), not where it's defined. Verify the import style inmodel_discovery_service.py—if it usesimport httpx, the patch target should be the full qualified path:src.server.services.ollama.model_discovery_service.httpx.AsyncClient.archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1)
5-28: Utility and typing look solid for config mirroring
InstanceConfigandsyncEmbeddingFromLLMare minimal, type-safe, and behave as expected (copy + default name fallback, no mutation). They fit the intended “single source of truth” pattern for LLM/Embedding instance configs.archon-ui-main/src/components/settings/utils/__tests__/instanceConfigSync.test.ts (1)
1-90: Good, comprehensive tests aligned with utility behaviorThese Vitest tests exercise the main paths (field copy, default/custom names, auth edge cases, immutability) and match
syncEmbeddingFromLLM’s contract. They also follow the Vitest testing guideline for frontend code.archon-ui-main/src/components/settings/RAGSettings.tsx (4)
571-585: Ollama metrics and model loading wiring looks goodThe expanded
ollamaMetricsstate (per-instance counts + available model lists) andloadAvailableModelshelper are cleanly structured and reused from both auto-load effects andmanualTestConnection. Using instance URL normalization and separate chat/embedding contexts should make the UI’s “available models” views reliable.Also applies to: 641-680, 885-903
189-221: Propagating auth settings and usingsyncEmbeddingFromLLMis a nice UX improvementThe new
useAuth/authTokenfields onllmInstanceConfigandembeddingInstanceConfig, coupled with:
- the “Use same host for embedding instance” checkbox wired to
syncEmbeddingFromLLM(llmInstanceConfig)(line 2350), and- per-modal auth sections with password inputs (lines 2360–2383, 2450–2473), and
- persistence into
ragSettingsviaOLLAMA_CHAT_AUTH_TOKEN/OLLAMA_EMBEDDING_AUTH_TOKEN(lines 2396–2400, 2486–2490),make the Ollama auth story much clearer and consistent across chat and embeddings.
Once the typings for those ragSettings fields are added (see earlier comment), this whole flow will be in good shape.
Also applies to: 2350-2351, 2450-2473, 2485-2491
1596-1653: API mode toggle UX for Ollama is clear and self-containedThe new “Ollama API Mode” section with explicit “Native Ollama API” vs “OpenAI-Compatible” buttons is a nice addition:
- State is stored on
ragSettings.OLLAMA_API_MODE.- Visual selection and radio-circle affordance make the current mode unambiguous.
- Helper text (“Uses /api/embeddings endpoint” vs “Uses /v1/embeddings endpoint”) sets expectations well.
Once the
OLLAMA_API_MODEfield is added toRAGSettingsProps, this section is in good shape.
145-145: The specific rendering issue described does not appear in the current codeThe exact code patterns referenced in this review comment—
ragSettings.CHAT_MODEL || <span className="text-gray-500 italic">Not selected</span>rendering,activeSelection === 'chat'logic, and thegetDisplayedChatModel()function—do not exist in the current RAGSettings.tsx implementation. Line 145 only contains the field definitionCHAT_MODEL?: string;as part of a type, not active rendering logic.If there is actual misalignment between
CHAT_MODELandMODEL_CHOICEfield usage in the component, it should be verified against the actual current implementation and resubmitted with specifics about where in the code the issue manifests.Likely an incorrect or invalid review comment.
-async def validate_provider_instance(provider: str, instance_url: str | None = None) -> dict[str, any]:
+async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, any]:
Type hint uses any instead of Any.
The return type annotation dict[str, any] uses lowercase any, which refers to the built-in any() function, not the Any type. This may cause type checking issues with MyPy.
-async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, any]:
+async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, Any]:
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, Any]:
🤖 Prompt for AI Agents
In python/src/server/services/llm_provider_service.py around line 1211, the
return type annotation uses the builtin any instead of the typing Any; change
the annotation from dict[str, any] to dict[str, Any] and ensure you import Any
(e.g. add "from typing import Any" at the top of the file if it's not already
imported) so static type checkers like MyPy recognize the correct type.
Merges OpenRouter embedding provider feature from upstream. Resolves conflict in llm_provider_service.py by keeping auth_token parameter. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Addresses CodeRabbit review: CHAT_MODEL was undefined because MODEL_CHOICE is the actual key used. Now uses helper functions that correctly resolve the displayed model name. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
16-20: Remove OpenRouter from EMBEDDING_CAPABLE_PROVIDERS
EMBEDDING_CAPABLE_PROVIDERSincludes'openrouter', but OpenRouter does not provide embedding models—it only acts as a proxy for LLM chat completions. This inclusion allows users to select OpenRouter as an embedding provider in the UI, which the backend cannot serve.Revert to:
-const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'openrouter', 'ollama']; +const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'ollama'];If OpenRouter embedding support is needed in the future, implement it with a feature flag once backend support is confirmed.
♻️ Duplicate comments (5)
python/src/server/services/llm_provider_service.py (1)
1211-1211: Type hint usesanyinstead ofAny(previously flagged).This issue was already identified in a previous review. The return type annotation uses lowercase
any(the built-in function) instead ofAnyfrom the typing module, which will cause MyPy type checking issues.Apply the previously suggested fix:
-async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, any]: +async def validate_provider_instance(provider: str, instance_url: str | None = None, auth_token: str | None = None) -> dict[str, Any]:Ensure
Anyis imported at the top of the file (it's already imported on line 11).archon-ui-main/src/components/settings/RAGSettings.tsx (4)
142-175: Type Ollama auth/API-mode fields inRAGSettingsPropsand drop(ragSettings as any)castsYou’re now reading/writing
ragSettings.OLLAMA_CHAT_AUTH_TOKEN,ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN, andragSettings.OLLAMA_API_MODE, but:
- These keys are not declared in
RAGSettingsProps['ragSettings'].- The effects rely on
(ragSettings as any).OLLAMA_*_AUTH_TOKENboth in the body and dependency arrays, which breaks strict typing and defeats the “no implicit any” goal.This is essentially the same issue the previous review already flagged for the auth tokens. Suggest tightening the types and removing the
anycasts:interface RAGSettingsProps { ragSettings: { MODEL_CHOICE: string; CHAT_MODEL?: string; @@ EMBEDDING_MODEL?: string; EMBEDDING_PROVIDER?: string; OLLAMA_EMBEDDING_URL?: string; OLLAMA_EMBEDDING_INSTANCE_NAME?: string; + OLLAMA_CHAT_AUTH_TOKEN?: string; + OLLAMA_EMBEDDING_AUTH_TOKEN?: string; + OLLAMA_API_MODE?: 'native' | 'openai-compatible'; @@ }; setRagSettings: (settings: any) => void; } @@ - const lastLLMConfigRef = useRef({ url: '', name: '', authToken: '' }); - const lastEmbeddingConfigRef = useRef({ url: '', name: '', authToken: '' }); + const lastLLMConfigRef = useRef<{ url: string; name: string; authToken: string }>({ + url: '', + name: '', + authToken: '', + }); + const lastEmbeddingConfigRef = useRef<{ url: string; name: string; authToken: string }>({ + url: '', + name: '', + authToken: '', + }); @@ useEffect(() => { const newLLMUrl = ragSettings.LLM_BASE_URL || ''; const newLLMName = ragSettings.LLM_INSTANCE_NAME || ''; - const newAuthToken = (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_CHAT_AUTH_TOKEN || ''; @@ - }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN]); + }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, ragSettings.OLLAMA_CHAT_AUTH_TOKEN]); @@ useEffect(() => { const newEmbeddingUrl = ragSettings.OLLAMA_EMBEDDING_URL || ''; const newEmbeddingName = ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME || ''; - const newAuthToken = (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN || ''; @@ - }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN]); + }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN]);This keeps the state-sync logic the same but restores type safety and satisfies strict TS in this file.
Also applies to: 225-227, 231-244, 249-272
641-679: Auto-load Ollama models should not special‑case the default URL
loadAvailableModelsitself looks good, but the auto-load effects still skip loading when the URL equals the hardcoded default'http://host.docker.internal:11434/v1'. That means users running a working default Ollama instance won’t see any models until they tweak the URL.You already have
DEFAULT_OLLAMA_URL; I’d drop the special-case and only require a non-empty, trimmed URL:// Auto-load available models when URLs are configured useEffect(() => { - if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(llmInstanceConfig.url, 'chat'); + if (llmInstanceConfig.url && llmInstanceConfig.url.trim()) { + void loadAvailableModels(llmInstanceConfig.url, 'chat'); } }, [llmInstanceConfig.url]); useEffect(() => { - if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); + if (embeddingInstanceConfig.url && embeddingInstanceConfig.url.trim()) { + void loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); } }, [embeddingInstanceConfig.url]);This also aligns with the earlier review suggestion to avoid hardcoding that URL.
Also applies to: 682-692
787-807: Keep instance config shape consistent and clear auth fields when deleting instances
llmInstanceConfig/embeddingInstanceConfignow includeuseAuthandauthToken, but some paths still construct configs with{ name, url }only and deletion doesn’t clear the persisted auth fields:
handleDeleteLLMInstanceandhandleDeleteEmbeddingInstancereset state to{ name: '', url: '' }, droppinguseAuthandauthToken.- Deletion only deletes
LLM_BASE_URL,MODEL_CHOICE,OLLAMA_EMBEDDING_URL, andEMBEDDING_MODELfromragSettings; instance names and auth tokens remain.- The “auto-populate embedding instance” block on LLM URL change also only sets
{ name, url }, ignoring auth flags, while you already havesyncEmbeddingFromLLMfor shape-safe syncing.This can lead to inconsistent state shape and “ghost” tokens surviving after delete. Consider:
const handleDeleteLLMInstance = () => { @@ - setLLMInstanceConfig({ - name: '', - url: '' - }); + setLLMInstanceConfig({ + name: '', + url: '', + useAuth: false, + authToken: '' + }); @@ const updatedSettings = { ...ragSettings }; delete updatedSettings.LLM_BASE_URL; delete updatedSettings.MODEL_CHOICE; + delete updatedSettings.LLM_INSTANCE_NAME; + delete updatedSettings.OLLAMA_CHAT_AUTH_TOKEN; @@ const handleDeleteEmbeddingInstance = () => { @@ - setEmbeddingInstanceConfig({ - name: '', - url: '' - }); + setEmbeddingInstanceConfig({ + name: '', + url: '', + useAuth: false, + authToken: '' + }); @@ const updatedSettings = { ...ragSettings }; delete updatedSettings.OLLAMA_EMBEDDING_URL; delete updatedSettings.EMBEDDING_MODEL; + delete updatedSettings.OLLAMA_EMBEDDING_INSTANCE_NAME; + delete updatedSettings.OLLAMA_EMBEDDING_AUTH_TOKEN; @@ - // Auto-populate embedding instance if it's empty (convenience for single-host users) - if (!embeddingInstanceConfig.url || !embeddingInstanceConfig.name) { - setEmbeddingInstanceConfig({ - name: llmInstanceConfig.name || 'Default Ollama', - url: newUrl - }); - } + // Auto-populate embedding instance if it's empty (convenience for single-host users) + if (!embeddingInstanceConfig.url || !embeddingInstanceConfig.name) { + // Reuse shared helper to keep shape (including auth flags) consistent + setEmbeddingInstanceConfig(syncEmbeddingFromLLM({ ...llmInstanceConfig, url: newUrl })); + } @@ - // Sync embedding instance with LLM instance (including auth settings) - setEmbeddingInstanceConfig(syncEmbeddingFromLLM(llmInstanceConfig)); + // Sync embedding instance with LLM instance (including auth settings) + setEmbeddingInstanceConfig(syncEmbeddingFromLLM(llmInstanceConfig));This keeps config state structurally consistent and ensures deleting an instance actually clears its auth settings as well.
[highlight: this overlaps with the earlier review about instance config shape.]Also applies to: 809-829, 2331-2335, 2349-2351
2360-2383: Validate auth tokens before saving whenUse Authenticationis enabledThe LLM and Embedding edit modals now let users toggle “Use Authentication” and enter tokens, and you persist them into
OLLAMA_CHAT_AUTH_TOKEN/OLLAMA_EMBEDDING_AUTH_TOKEN. However, ifuseAuthis true and the token input is empty/whitespace, you still write an empty token toragSettings. This matches an earlier review concern and can lead to confusing failures when the backend attempts authenticated calls with an empty header.Add a small guard before constructing
updatedSettingsin both modals:<Button onClick={async () => { + if (llmInstanceConfig.useAuth && !llmInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } const updatedSettings = { ...ragSettings, LLM_BASE_URL: llmInstanceConfig.url, OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : '' }; @@ <Button onClick={async () => { + if (embeddingInstanceConfig.useAuth && !embeddingInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } const updatedSettings = { ...ragSettings, OLLAMA_EMBEDDING_URL: embeddingInstanceConfig.url, OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken : '' };This keeps the UX tight and avoids persisting obviously invalid credentials.
Also applies to: 2395-2401, 2485-2491
🧹 Nitpick comments (4)
python/src/server/services/llm_provider_service.py (2)
415-441: Consider extracting token resolution logic to reduce duplication.The token resolution pattern here is duplicated at lines 455-480. Both segments fetch RAG settings, select the appropriate token (embedding vs chat), and apply the same "required-but-ignored" fallback logic.
Consider extracting this into a helper function:
+async def _resolve_ollama_auth_token(use_embedding_provider: bool, instance_type: str | None = None) -> str: + """Resolve Ollama auth token based on operation type.""" + cache_key = "rag_strategy_settings" + rag_settings = _get_cached_settings(cache_key) + if rag_settings is None: + rag_settings = await credential_service.get_credentials_by_category("rag_strategy") + _set_cached_settings(cache_key, rag_settings) + + # Select token based on operation type + if use_embedding_provider or instance_type == "embedding": + token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "") + logger.info(f"Using OLLAMA_EMBEDDING_AUTH_TOKEN (length: {len(token) if token else 0})") + else: + token = rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN", "") + logger.info(f"Using OLLAMA_CHAT_AUTH_TOKEN (length: {len(token) if token else 0})") + + # Use placeholder if no token is set + if not token: + token = "required-but-ignored" + logger.warning("No Ollama auth token found, using placeholder 'required-but-ignored'") + + return tokenThen simplify both call sites:
- # Check for auth token in RAG settings for fallback - cache_key = "rag_strategy_settings" - rag_settings = _get_cached_settings(cache_key) - if rag_settings is None: - rag_settings = await credential_service.get_credentials_by_category("rag_strategy") - _set_cached_settings(cache_key, rag_settings) - - # Get correct auth token based on operation type - if use_embedding_provider: - ollama_auth_token = rag_settings.get("OLLAMA_EMBEDDING_AUTH_TOKEN", "") - else: - ollama_auth_token = rag_settings.get("OLLAMA_CHAT_AUTH_TOKEN", "") - - # Use "required-but-ignored" as default if no token is set (Ollama doesn't validate when auth is disabled) - if not ollama_auth_token: - ollama_auth_token = "required-but-ignored" + ollama_auth_token = await _resolve_ollama_auth_token(use_embedding_provider)
618-622: Consider consolidating the logging statements.Lines 618 and 621 log very similar information about the embedding URL resolution. This could be simplified to a single log statement for clarity.
embedding_url = rag_settings.get("OLLAMA_EMBEDDING_URL") - logger.info(f"Embedding URL from settings: {embedding_url}") if embedding_url: final_url = embedding_url if embedding_url.endswith('/v1') else f"{embedding_url}/v1" - logger.info(f"Resolved embedding URL: {final_url}") + logger.info(f"Resolved embedding URL: {embedding_url} → {final_url}") return final_urlarchon-ui-main/src/components/settings/RAGSettings.tsx (2)
572-584: Ollama per-instance metrics + available model lists look consistentThe extended
ollamaMetricsshape (per-instance counts for both LLM and embedding plusllmAvailableModels/embeddingAvailableModels) and the corresponding updates infetchOllamaMetricsare internally consistent and line up with how you’re rendering:
- Summary table rows use
embeddingInstanceModelssafely.- The “Available on this instance” row pulls counts by context, which matches the per-instance split.
If you don’t end up wiring
llmAvailableModels/embeddingAvailableModelsinto the selection modals, you could later trim those from state to reduce surface area, but as-is this structure is fine and future‑proof.Also applies to: 885-903
731-740: Toast model counts inmanualTestConnectionmay be stale vs per-call instance statusInside
manualTestConnection, the success toast usesollamaMetrics.llmInstanceModels/embeddingInstanceModelsto derivemodelCount, but those metrics are only refreshed later viafetchOllamaMetrics()and may still be at their initial or previous values. This can yield messages like “0 chat models available” even when the instance has models and the backend returned a model count.Since
instanceStatusfrom the health endpoint already containsmodels_available, consider basing the toast count on that and only using the metrics for more detailed UIs:- // Context-aware model count display - let modelCount = instanceStatus.models_available || 0; - let modelType = 'models'; - - if (context === 'chat') { - modelCount = ollamaMetrics.llmInstanceModels?.chat || 0; - modelType = 'chat models'; - } else if (context === 'embedding') { - modelCount = ollamaMetrics.embeddingInstanceModels?.embedding || 0; - modelType = 'embedding models'; - } + // Context-aware model count display + let modelCount = instanceStatus.models_available || 0; + let modelType = 'models'; + if (context === 'chat') { + modelType = 'chat models'; + } else if (context === 'embedding') { + modelType = 'embedding models'; + } @@ - showToast(`${instanceName} connection successful: ${modelCount} ${modelType} available (${responseTime}ms)`, 'success'); + showToast( + `${instanceName} connection successful: ${modelCount} ${modelType} available (${responseTime}ms)`, + 'success' + ); @@ - // Load available models after successful health check - if (context) { - await loadAvailableModels(url, context); - } + // Load available models after successful health check + if (context) { + await loadAvailableModels(url, context); + }This keeps the metrics fetch for richer views while making the immediate toast reflect the actual response.
Also applies to: 752-755
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- archon-ui-main/src/components/settings/RAGSettings.tsx (17 hunks)
- python/src/server/services/llm_provider_service.py (5 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Fail fast and loud for service startup failures, missing configuration, database connection failures, authentication/authorization failures, data corruption, critical dependency unavailability, and invalid data that would corrupt state
Never accept corrupted data - skip failed items entirely rather than storing corrupted or incomplete data
Files:
python/src/server/services/llm_provider_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Complete batch processing, background tasks, and external API calls but log detailed errors for each failure instead of crashing
Include context about what was being attempted, preserve full stack traces with exc_info=True in Python logging, use specific exception types, include relevant IDs/URLs, never return None to indicate failure, and report both success count and detailed failure list for batch operations
Use Ruff for linting and MyPy for type checking with 120 character line length in Python backend
Use Python 3.12 with 120 character line length in backend code
Import GITHUB_REPO_OWNER and GITHUB_REPO_NAME constants from python/src/server/config/version.py as the single source of truth for repository configuration
python/src/**/*.py: Python backend code should use 120 character line length
Use Ruff for linting Python code to check for errors, warnings, and unused imports
Use Mypy for type checking in Python to ensure type safety
Files:
python/src/server/services/llm_provider_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
python/src/server/services/**/*.py: For batch processing and background tasks, complete what you can and report detailed failures for each item rather than crashing
Never accept corrupted data - skip failed items entirely rather than storing corrupted data in batch operations
Files:
python/src/server/services/llm_provider_service.py
python/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Error messages must include context about what was being attempted, preserve full stack traces, use specific exception types, include relevant IDs/URLs, and report both success count and detailed failure list for batch operations
Files:
python/src/server/services/llm_provider_service.py
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/**/*.{ts,tsx}: Use TypeScript in strict mode with no implicit any in frontend code
Do not use dynamic Tailwind class construction in frontend components
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/!(features)/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use ESLint with standard React rules for legacy code in the frontend
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use ESLint for legacy frontend code outside /src/features directory with standard React rules
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧠 Learnings (3)
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/**/*.{ts,tsx} : Use TypeScript in strict mode with no implicit any in frontend code
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : TypeScript strict mode enabled with no implicit any, and use Biome with 120 character lines and double quotes in /src/features directory
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-09-25T16:36:48.527Z
Learnt from: Chillbruhhh
Repo: coleam00/Archon PR: 736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.527Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (2)
python/src/server/services/llm_provider_service.py (3)
- python/src/server/services/credential_service.py (1): get_credentials_by_category (323-369)
- python/src/server/api_routes/settings_api.py (1): get_credentials_by_category (78-93)
- python/src/server/services/ollama/model_discovery_service.py (1): check_instance_health (964-1024)
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
- archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1): syncEmbeddingFromLLM (18-28)
- archon-ui-main/src/features/ui/primitives/input.tsx (1): Input (8-29)
🔇 Additional comments (4)
python/src/server/services/llm_provider_service.py (2)
455-480: Token resolution logic is correct. The token selection correctly handles both `use_embedding_provider` and `instance_type == "embedding"` cases, ensuring the appropriate token is used for each operation type. The fallback to "required-but-ignored" is reasonable for Ollama instances without authentication enabled. Note: the logging on lines 465 and 468 reveals which specific token type (embedding vs chat) is being used. While this doesn't expose the token value itself, consider whether this information disclosure is acceptable in your security model.
1218-1218: LGTM! Authentication token parameter properly documented and propagated.The
auth_tokenparameter is correctly documented in the function docstring and properly passed through to thecheck_instance_healthservice method on line 1235.Also applies to: 1235-1235
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
1594-1653: API mode UI is solid; just ensureOLLAMA_API_MODEis typed and persisted consistentlyThe “Ollama API Mode” toggle is a nice UX touch and correctly treats
undefinedas “native” by default. With the earlier type additions toRAGSettingsProps, this section becomes type-safe.Nothing to change here beyond making sure
OLLAMA_API_MODE?: 'native' | 'openai-compatible'is added to theragSettingstype (see previous comment) so these reads/writes don’t rely on structural typing.
1846-1873: Authentication and selected-model rows in the summary table are well designedThe new “Authentication” row correctly displays only a boolean status (“Token configured” vs “No authentication”) and avoids leaking token contents. Likewise, the “Selected Model” row now gracefully falls back to a “Not selected” placeholder for both chat and embeddings, which will help users debug misconfigurations.
No code changes needed here.
Also applies to: 1890-1899
- Add useAuth/authToken to instance config resets (RAGSettings.tsx) - Clear auth tokens from ragSettings when deleting instances - Fix type hint: dict[str, any] -> dict[str, Any] (llm_provider_service.py) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
- ArchonChatPanel: Remove BookOpen, Search - BugReportModal: Remove ExternalLink - ErrorBoundaryWithBugReport: Remove React (use named imports) - CodeViewerModal: Remove FileText - DisconnectScreenOverlay: Remove Wifi, WifiOff - APIKeysSection: Remove Key, Input 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 0
♻️ Duplicate comments (3)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
142-176: Add Ollama auth/API mode fields to RAGSettingsProps and removeas anycastsThe component now reads/writes
OLLAMA_CHAT_AUTH_TOKEN,OLLAMA_EMBEDDING_AUTH_TOKEN, andOLLAMA_API_MODEviaragSettings, but these keys are not declared onRAGSettingsProps["ragSettings"]. Accessing them via(ragSettings as any)both weakens type safety (violating strict mode) and makes it easy to desync UI from persisted settings. This also meansragSettings.OLLAMA_API_MODEis currently a type error.You can fix this by extending the interface and wiring the effects to the typed fields, eliminating the
as anyusage:interface RAGSettingsProps { ragSettings: { MODEL_CHOICE: string; - CHAT_MODEL?: string; + CHAT_MODEL?: string; USE_CONTEXTUAL_EMBEDDINGS: boolean; @@ EMBEDDING_MODEL?: string; EMBEDDING_PROVIDER?: string; OLLAMA_EMBEDDING_URL?: string; OLLAMA_EMBEDDING_INSTANCE_NAME?: string; + OLLAMA_CHAT_AUTH_TOKEN?: string; + OLLAMA_EMBEDDING_AUTH_TOKEN?: string; + OLLAMA_API_MODE?: 'native' | 'openai-compatible'; @@ }; setRagSettings: (settings: any) => void; } @@ useEffect(() => { const newLLMUrl = ragSettings.LLM_BASE_URL || ''; const newLLMName = ragSettings.LLM_INSTANCE_NAME || ''; - const newAuthToken = (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_CHAT_AUTH_TOKEN || ''; @@ - }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, (ragSettings as any).OLLAMA_CHAT_AUTH_TOKEN]); + }, [ragSettings.LLM_BASE_URL, ragSettings.LLM_INSTANCE_NAME, ragSettings.OLLAMA_CHAT_AUTH_TOKEN]); @@ useEffect(() => { const newEmbeddingUrl = ragSettings.OLLAMA_EMBEDDING_URL || ''; const newEmbeddingName = ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME || ''; - const newAuthToken = (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN || ''; + const newAuthToken = ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN || ''; @@ - }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, (ragSettings as any).OLLAMA_EMBEDDING_AUTH_TOKEN]); + }, [ragSettings.OLLAMA_EMBEDDING_URL, ragSettings.OLLAMA_EMBEDDING_INSTANCE_NAME, ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN]);This keeps the props contract honest and satisfies strict TypeScript without resorting to
any.Also applies to: 223-227, 228-249, 251-272, 1600-1659, 2404-2410, 2493-2499
641-693: Auto‑loading available models still skips the default Ollama URL
loadAvailableModelsis a nice addition and the metrics wiring (llmAvailableModels/embeddingAvailableModels) looks good, but the auto‑load effects still short‑circuit when the URL equals the defaulthttp://host.docker.internal:11434/v1. This means users running Ollama on the default host won’t see model lists until they manually tweak the URL or hit “Test Connection”.You can address this and simplify the condition by loading models whenever a non‑empty URL is present:
// Auto-load available models when URLs are configured useEffect(() => { - if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(llmInstanceConfig.url, 'chat'); - } + if (llmInstanceConfig.url && llmInstanceConfig.url.trim()) { + void loadAvailableModels(llmInstanceConfig.url, 'chat'); + } }, [llmInstanceConfig.url]); useEffect(() => { - if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') { - loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); - } + if (embeddingInstanceConfig.url && embeddingInstanceConfig.url.trim()) { + void loadAvailableModels(embeddingInstanceConfig.url, 'embedding'); + } }, [embeddingInstanceConfig.url]);This preserves the debounce behavior (hook still keys off URL changes) and makes the default local instance behave like any other.
Also applies to: 752-755, 572-584, 891-909
2368-2391: Require non‑empty auth tokens when authentication is enabled for LLM/embedding instancesThe new auth controls in both edit modals correctly wire
useAuth/authTokeninto local state and persistOLLAMA_CHAT_AUTH_TOKEN/OLLAMA_EMBEDDING_AUTH_TOKEN. However, if a user enables “Use Authentication” but leaves the token empty, the code still proceeds to “save” and pushes an empty string intoragSettings(effectively disabling auth) without any feedback.This is the same edge case highlighted in the prior review; adding lightweight validation before constructing
updatedSettingswill prevent confusing failures:// LLM instance modal save - <Button - onClick={async () => { - const updatedSettings = { + <Button + onClick={async () => { + if (llmInstanceConfig.useAuth && !llmInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } + const updatedSettings = { ...ragSettings, LLM_BASE_URL: llmInstanceConfig.url, OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : '' }; setRagSettings(updatedSettings); @@ // Embedding instance modal save - <Button - onClick={async () => { - const updatedSettings = { + <Button + onClick={async () => { + if (embeddingInstanceConfig.useAuth && !embeddingInstanceConfig.authToken.trim()) { + showToast('Authentication token is required when authentication is enabled', 'error'); + return; + } + const updatedSettings = { ...ragSettings, OLLAMA_EMBEDDING_URL: embeddingInstanceConfig.url, OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken : '' }; setRagSettings(updatedSettings);That keeps the UX aligned with the toggles: “Use Authentication” can’t be turned on with an empty token.
Also applies to: 2403-2432, 2458-2481, 2493-2521
🧹 Nitpick comments (4)
archon-ui-main/src/components/bug-report/BugReportModal.tsx (1)
230-235: Consider removing theas anytype assertion.Line 233 uses
as anywhich bypasses TypeScript's type checking. Theseverityfield should have a proper union type matching the dropdown options.As per coding guidelines, TypeScript strict mode with no implicit
anyshould be enforced.You can define a proper type for severity and use it instead:
type BugSeverity = "low" | "medium" | "high" | "critical";Then update the cast to:
onChange={(e) => setReport((r) => ({ ...r, - severity: e.target.value as any, + severity: e.target.value as BugSeverity, })) }python/src/server/services/llm_provider_service.py (2)
399-445: Ollama auth token resolution and fallback behavior look consistentThe new logic to:
- Resolve
OLLAMA_CHAT_AUTH_TOKEN/OLLAMA_EMBEDDING_AUTH_TOKENfrom cachedrag_strategysettings,- Fall back to a
"required-but-ignored"placeholder when no token is configured, and- Use that value as
api_keyfor both the OpenAI→Ollama fallback and direct"ollama"provider paths,is coherent and avoids logging the actual token (only lengths are logged). This should work whether auth is disabled (placeholder ignored by Ollama) or enabled (real token present).
If you find this pattern spreading, consider extracting a small helper like
get_ollama_auth_token(use_embedding_provider: bool, rag_settings: dict) -> strto keep the token selection and placeholder rules in one place, but it’s not strictly necessary right now.Also applies to: 447-481
1211-1235: validate_provider_instance auth_token plumbing is correct; consider richer error logsExtending
validate_provider_instanceto acceptauth_tokenand forwarding it intomodel_discovery_service.check_instance_health(instance_url, auth_token=auth_token)correctly enables per‑instance authenticated health checks for Ollama. The return shape and non‑Ollama branches are preserved, and the return type now usesAny(fixing the earlieranytypo).To better align with the backend logging guidelines, consider adding
exc_info=Trueto the error log in theexceptblock so health‑check failures preserve full stack traces:- except Exception as e: - logger.error(f"Error validating provider {provider}: {e}") + except Exception as e: + logger.error(f"Error validating provider {provider}: {e}", exc_info=True)Also applies to: 1275-1285
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
1852-1879: Align “Authentication” summary row withuseAuthstate, not just token presenceThe “Authentication” row in the Ollama configuration summary currently reports “Token configured” purely based on whether
authTokenis non‑empty, ignoringuseAuth. Because you intentionally keep the token around in local state even whenuseAuthis toggled off, this can show “Token configured” while authentication is actually disabled.You can make the summary reflect the real effective state by incorporating
useAuth:- <tr> - <td className="py-2 text-gray-400">Authentication</td> - <td className="py-2"> - {activeSelection === 'chat' ? ( - llmInstanceConfig.authToken ? ( + <tr> + <td className="py-2 text-gray-400">Authentication</td> + <td className="py-2"> + {activeSelection === 'chat' ? ( + llmInstanceConfig.useAuth && llmInstanceConfig.authToken ? ( <span className="text-teal-400 flex items-center"> @@ ) : ( <span className="text-gray-500 italic">No authentication</span> ) ) : ( - embeddingInstanceConfig.authToken ? ( + embeddingInstanceConfig.useAuth && embeddingInstanceConfig.authToken ? ( <span className="text-teal-400 flex items-center"> @@ ) : ( <span className="text-gray-500 italic">No authentication</span> ) )} </td> </tr>This keeps “Token configured” in sync with the toggle while still letting you cache the token client‑side for convenience.
Also applies to: 2368-2390, 2458-2480
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- archon-ui-main/src/components/DisconnectScreenOverlay.tsx (1 hunks)
- archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx (1 hunks)
- archon-ui-main/src/components/bug-report/BugReportModal.tsx (1 hunks)
- archon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsx (1 hunks)
- archon-ui-main/src/components/code/CodeViewerModal.tsx (0 hunks)
- archon-ui-main/src/components/settings/APIKeysSection.tsx (1 hunks)
- archon-ui-main/src/components/settings/RAGSettings.tsx (20 hunks)
- python/src/server/services/llm_provider_service.py (5 hunks)
💤 Files with no reviewable changes (1)
- archon-ui-main/src/components/code/CodeViewerModal.tsx
🧰 Additional context used
📓 Path-based instructions (7)
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/**/*.{ts,tsx}: Use TypeScript in strict mode with no implicit any in frontend code
Do not use dynamic Tailwind class construction in frontend components
Files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/DisconnectScreenOverlay.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsxarchon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/!(features)/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use ESLint with standard React rules for legacy code in the frontend
Files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/DisconnectScreenOverlay.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsxarchon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use ESLint for legacy frontend code outside /src/features directory with standard React rules
Files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/DisconnectScreenOverlay.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsxarchon-ui-main/src/components/settings/RAGSettings.tsx
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Fail fast and loud for service startup failures, missing configuration, database connection failures, authentication/authorization failures, data corruption, critical dependency unavailability, and invalid data that would corrupt state
Never accept corrupted data - skip failed items entirely rather than storing corrupted or incomplete data
Files:
python/src/server/services/llm_provider_service.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Complete batch processing, background tasks, and external API calls but log detailed errors for each failure instead of crashing
Include context about what was being attempted, preserve full stack traces with exc_info=True in Python logging, use specific exception types, include relevant IDs/URLs, never return None to indicate failure, and report both success count and detailed failure list for batch operations
Use Ruff for linting and MyPy for type checking with 120 character line length in Python backend
Use Python 3.12 with 120 character line length in backend code
Import GITHUB_REPO_OWNER and GITHUB_REPO_NAME constants from python/src/server/config/version.py as the single source of truth for repository configuration
python/src/**/*.py: Python backend code should use 120 character line length
Use Ruff for linting Python code to check for errors, warnings, and unused imports
Use Mypy for type checking in Python to ensure type safety
Files:
python/src/server/services/llm_provider_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
python/src/server/services/**/*.py: For batch processing and background tasks, complete what you can and report detailed failures for each item rather than crashing
Never accept corrupted data - skip failed items entirely rather than storing corrupted data in batch operations
Files:
python/src/server/services/llm_provider_service.py
python/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Error messages must include context about what was being attempted, preserve full stack traces, use specific exception types, include relevant IDs/URLs, and report both success count and detailed failure list for batch operations
Files:
python/src/server/services/llm_provider_service.py
🧠 Learnings (12)
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : TypeScript strict mode enabled with no implicit any, and use Biome with 120 character lines and double quotes in /src/features directory
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsxarchon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/components/**/*.{ts,tsx} : Use Radix UI primitives from src/features/ui/primitives/ for new UI components in features directory
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/features/**/components/**/*.{ts,tsx} : Prefer using Radix UI primitives from src/features/ui/primitives/ when creating UI components
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/!(features)/**/*.{ts,tsx} : Use ESLint with standard React rules for legacy code in the frontend
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsx
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/**/*.{ts,tsx} : Use TypeScript in strict mode with no implicit any in frontend code
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/settings/APIKeysSection.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsxarchon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/**/*.{ts,tsx,js,jsx} : Use ESLint for legacy frontend code outside /src/features directory with standard React rules
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsxarchon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsx
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : Apply Tron-inspired glassmorphism styling with Tailwind and follow responsive design patterns with mobile-first approach
Applied to files:
archon-ui-main/src/components/bug-report/BugReportModal.tsx
📚 Learning: 2025-09-04T16:30:05.227Z
Learnt from: stevepresley
Repo: coleam00/Archon PR: 573
File: archon-ui-main/src/config/api.ts:15-25
Timestamp: 2025-09-04T16:30:05.227Z
Learning: Archon UI API config: Prefer lazy getters getApiFullUrl() and getWsUrl() over module-load constants to avoid SSR/test crashes. Avoid CommonJS exports patterns (Object.defineProperty(exports,…)) in ESM. Add typeof window guards with VITE_API_URL fallback inside getApiUrl()/getWebSocketUrl() when SSR safety is required.
Applied to files:
archon-ui-main/src/components/settings/APIKeysSection.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/components/**/*.{ts,tsx} : Apply Tron-inspired glassmorphism styling with Tailwind in features directory UI components
Applied to files:
archon-ui-main/src/components/settings/APIKeysSection.tsx
📚 Learning: 2025-08-28T12:03:56.388Z
Learnt from: Wirasm
Repo: coleam00/Archon PR: 514
File: archon-ui-main/src/components/layouts/ArchonChatPanel.tsx:71-81
Timestamp: 2025-08-28T12:03:56.388Z
Learning: The agent chat service is disabled in the Archon codebase, so optimizations or fixes related to agent chat functionality may not be necessary to implement.
Applied to files:
archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : Use TanStack Query for all data fetching with smart HTTP polling where appropriate
Applied to files:
archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx
📚 Learning: 2025-09-25T16:36:48.527Z
Learnt from: Chillbruhhh
Repo: coleam00/Archon PR: 736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.527Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1)
syncEmbeddingFromLLM(18-28)
🔇 Additional comments (7)
archon-ui-main/src/components/agent-chat/ArchonChatPanel.tsx (1)
2-2: LGTM! Clean removal of unused imports.The
BookOpenandSearchicons are not referenced anywhere in the component, so removing them from the import statement is appropriate and improves code cleanliness.archon-ui-main/src/components/DisconnectScreenOverlay.tsx (1)
2-2: LGTM! Cleanup of unused imports.The removal of
WifiandWifiOffimports is correct since only theXicon is used in the component (line 40). This cleanup improves code maintainability.archon-ui-main/src/components/bug-report/BugReportModal.tsx (1)
3-3: LGTM! Clean removal of unused import.The
ExternalLinkicon was not referenced anywhere in the component, so removing it from the import is appropriate cleanup.archon-ui-main/src/components/bug-report/ErrorBoundaryWithBugReport.tsx (1)
1-1: LGTM! Modern import pattern aligns with new JSX transform.The refactoring to named imports is correct and aligns with the new JSX runtime. All necessary imports (Component, ErrorInfo, ReactNode) are present and properly used throughout the file.
archon-ui-main/src/components/settings/APIKeysSection.tsx (1)
2-2: Import surface reduction looks goodRemoving unused icons from the
lucide-reactimport keeps the bundle lean and avoids unused‑symbol lint noise. No behavioral changes introduced.python/src/server/services/llm_provider_service.py (1)
612-622: Embedding URL logging is helpful and safeThe additional logging of
OLLAMA_EMBEDDING_URLand the resolved/v1URL in_get_optimal_ollama_instanceimproves observability for routing issues without exposing secrets. Behavior for chat vs embedding URLs and the fallback toLLM_BASE_URLremains unchanged.archon-ui-main/src/components/settings/RAGSettings.tsx (1)
1600-1659: Ollama API mode toggle and instance auth state wiring look solidThe new “Ollama API Mode” radio group correctly writes
OLLAMA_API_MODEas either'native'or'openai-compatible'intoragSettings, and the top‑level “Save Settings” path will persist it viacredentialsService.updateRagSettings(updatedSettings).Similarly, extending
llmInstanceConfig/embeddingInstanceConfigwithuseAuthandauthToken, ensuring deletion resets include those fields, and usingsyncEmbeddingFromLLM(llmInstanceConfig)when “Use same host for embedding instance” is checked all keep the instance configuration shape consistent and address the earlier state‑shape/reset concerns.No changes needed here beyond the type additions already suggested.
Also applies to: 210-221, 217-221, 2357-2358, 787-821
- Add pymupdf4llm>=0.2.5 as primary PDF extraction library
- Produces clean Markdown with proper word separation
- Falls back to pdfplumber, then PyPDF2 if needed
- Validated: pymupdf4llm extracts "Since 1996, Bloomberg" vs pdfplumber's "Since1996,Bloomberg" (no spaces)

Phase0 validated, Phase1 complete. Phase2 (OCR) pending.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
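As a rough illustration of the fallback chain this commit describes (the real extraction code lives in document_processing.py and differs in detail; the 100-character acceptance threshold below is an assumption borrowed from the review notes):

```python
def extract_pdf_text(pdf_path: str) -> str | None:
    """Try extractors in order of output quality; return None if all fail."""
    # 1. pymupdf4llm: Markdown output with proper word separation
    try:
        import pymupdf4llm
        text = pymupdf4llm.to_markdown(pdf_path)
        if len(text.strip()) > 100:
            return text
    except Exception:
        pass  # fall through to the next extractor
    # 2. pdfplumber
    try:
        import pdfplumber
        with pdfplumber.open(pdf_path) as pdf:
            text = "\n".join(page.extract_text() or "" for page in pdf.pages)
        if len(text.strip()) > 100:
            return text
    except Exception:
        pass
    # 3. PyPDF2 as the last resort
    try:
        from PyPDF2 import PdfReader
        text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
        return text.strip() or None
    except Exception:
        return None
```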
…de block protection

- Add heading boundary (\n#) as highest priority breakpoint
- Protect code blocks from being split mid-block
- Add is_inside_code_block() helper to check if position is within ```...```
- Only break at paragraphs/sentences when outside code blocks

This improves RAG quality for Markdown documents (especially from pymupdf4llm) by preserving semantic structure during chunking.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
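A condensed sketch of the breakpoint priority this commit describes, assuming a roughly 30% lower bound so chunks don't become tiny; the actual smart_chunk_text carries more bookkeeping:

```python
FENCE = "`" * 3  # literal ``` built indirectly so it can appear inside this example

def is_inside_code_block(text: str, pos: int) -> bool:
    """A position is inside a fenced block if an odd number of fences precede it."""
    return text[:pos].count(FENCE) % 2 == 1

def pick_break(chunk: str, chunk_size: int) -> int:
    """Prefer a heading, then a paragraph break, then a sentence end."""
    lower = chunk_size // 3
    heading = chunk.rfind("\n#", lower, chunk_size)
    if heading > 0 and not is_inside_code_block(chunk, heading):
        return heading
    paragraph = chunk.rfind("\n\n", lower, chunk_size)
    if paragraph > 0 and not is_inside_code_block(chunk, paragraph):
        return paragraph
    sentence = chunk.rfind(". ", lower, chunk_size)
    if sentence > 0 and not is_inside_code_block(chunk, sentence):
        return sentence + 1  # keep the period with the current chunk
    return chunk_size
```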
…dels

BREAKING: Search now supports all embedding dimensions (384, 768, 1024, 1536, 3072)

Previously search was hardcoded to use 1536-dim functions, which meant:
- snowflake-arctic-embed2 (1024-dim) → NO RESULTS
- nomic-embed-text (768-dim) → NO RESULTS
- Only OpenAI text-embedding-3-small (1536-dim) worked

Fix:
- Detect dimension from query_embedding: len(query_embedding)
- Use _multi RPC functions with embedding_dimension parameter
- base_search_strategy.py: match_archon_*_multi
- hybrid_search_strategy.py: hybrid_search_*_multi

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
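In effect the strategies now derive the dimension from the query vector and call the `_multi` RPCs; a simplified Supabase call might look like this (parameter names follow the commit message and may not match the migration exactly):

```python
def vector_search(client, query_embedding: list[float], match_count: int = 10):
    # The dimension comes from the query vector itself, so 768/1024/1536-dim
    # models all route to the matching vector column instead of returning nothing.
    return client.rpc(
        "match_archon_crawled_pages_multi",
        {
            "query_embedding": query_embedding,
            "embedding_dimension": len(query_embedding),
            "match_count": match_count,
        },
    ).execute()
```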
Adds automatic OCR extraction for scanned/image-based PDFs that don't contain extractable text layers.

New files:
- ocr_processing.py: Tesseract OCR engine wrapper
- extract_text_with_ocr(): Convert PDF pages to images, run OCR
- is_ocr_available(): Check for pytesseract/pdf2image
- get_supported_languages(): List installed Tesseract languages

Changes:
- document_processing.py: OCR fallback when text extraction fails
- pyproject.toml: Add pytesseract>=0.3.10, pdf2image>=1.17.0

Flow: pymupdf4llm → pdfplumber → PyPDF2 → Tesseract OCR

Requires system dependencies:
- macOS: brew install tesseract poppler
- Linux: apt install tesseract-ocr poppler-utils

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
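A rough sketch of the OCR path, assuming pdf2image and pytesseract are importable and the Tesseract/Poppler system packages are installed; the real extract_text_with_ocr adds logging and a minimum-length check on the result:

```python
def extract_text_with_ocr(pdf_bytes: bytes, language: str = "eng") -> str | None:
    """Rasterize each page and run Tesseract; return None if OCR is unavailable."""
    try:
        import pytesseract
        from pdf2image import convert_from_bytes
    except ImportError:
        return None  # optional OCR dependencies not installed

    texts: list[str] = []
    for image in convert_from_bytes(pdf_bytes, dpi=300):
        try:
            texts.append(pytesseract.image_to_string(image, lang=language))
        except Exception:
            continue  # skip unreadable pages, keep the rest
    text = "\n".join(texts).strip()
    return text or None
```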
@Milofax, we already support Ollama and have all the capabilities listed in this PR except for the Ollama API selection capability, but I don't see a need for this, as we work with the native Ollama method and I've never heard of someone using Ollama via the OpenAI specification when the native one is available.
Hey @tazmon95, thanks for the feedback! Let me provide some context on why this PR exists. To be honest, I wouldn't have touched the Ollama implementation at all if it had worked for my use case. The changes
If the toggle adds unnecessary complexity and nobody uses the OpenAI-compatible mode for Ollama, I'm totally fine with removing it. Happy to discuss or simplify where it makes sense! 🙂
Thanks @Milofax,
…issues Phase 3 Summary Feature: - Beautiful View with react-markdown + MermaidRenderer - Edit View with textarea + live preview - Backend summary update support - Regenerate warning dialog - E2E tests (14/14 passed) CodeRabbit Fixes (PR coleam00#875): - Fix authToken persistence in credentialsService - Remove unnecessary 'as any' casts in RAGSettings - Add authToken to OllamaInstance load/save 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
…stence - Add authToken persistence to setOllamaInstances in credentialsService.ts - Remove 'as any' type casts for OLLAMA_*_AUTH_TOKEN in RAGSettings.tsx - Types are already defined in RAGSettings interface (lines 171-172) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 2
🧹 Nitpick comments (8)
python/src/server/services/storage/base_storage_service.py (1)
78-80: Move helper function outside the loop to avoid repeated redefinition.
`is_inside_code_block` is redefined on every iteration of the `while` loop (line 64), which is inefficient. Since it doesn't capture any loop-local state, extract it to module level or as a static/class method.

+def _is_inside_code_block(text_segment: str, pos: int) -> bool:
+    """Check if position is inside a code block."""
+    return text_segment[:pos].count("```") % 2 == 1
+
+
 class BaseStorageService(ABC):

Then remove the nested definition and reference `_is_inside_code_block` directly inside `smart_chunk_text`.
253-254: Consider moving imports to module level.
`tempfile` and `os` are standard library modules with negligible import overhead. Moving them to the top of the file improves clarity and avoids repeated import resolution on each function call.

 import io
+import os
+import tempfile
 # Removed direct logging import - using unified config

Then remove the inline imports at lines 253-254.
264-268: Extract minimum text thresholds to named constants.The magic numbers 100 (lines 264, 289) and 50 (line 325) control fallback behavior but their purpose isn't immediately clear. Named constants would improve maintainability and make tuning easier.
# At module level
MIN_TEXT_CHARS_PRIMARY = 100  # Minimum chars for pymupdf4llm/pdfplumber success
MIN_TEXT_CHARS_OCR = 50  # Lower threshold for OCR (typically yields less text)
265-273: Inconsistent logging between `logger` and `logfire`. Success messages use `logger.info()` (line 265) while failure/warning messages use `logfire.warning()` (lines 268, 273). This pattern continues throughout the function. Consider using one consistently for warnings and fallbacks to simplify log aggregation: typically `logger.warning()` for operational messages and `logfire` for structured observability events.
109-111: Include `exc_info=True` for exception logging. Per coding guidelines, exception handlers should preserve full stack traces. This aids debugging when OCR fails on specific pages.

 except Exception as e:
-    logger.warning(f"OCR failed on page {page_num}: {e}")
+    logger.warning(f"OCR failed on page {page_num}: {e}", exc_info=True)
     continue

Apply the same pattern at lines 122-124 and 141-143.
122-124: Add stack trace to the error log. The exception is logged but without the full traceback. For external API calls (the Tesseract subprocess), detailed traces help diagnose issues like missing binaries or corrupted PDFs.

 except Exception as e:
-    logfire.error(f"OCR extraction failed: {e}")
+    logfire.error(f"OCR extraction failed: {e}", exc_info=True)
     return None
93-111: Consider validating language availability upfront. If an invalid `language` is passed, Tesseract will fail on every page. While the per-page exception handler prevents crashes, it's inefficient. An upfront check could provide a clearer error:

if language not in get_supported_languages():
    logger.warning(f"Language '{language}' may not be installed, falling back to 'eng'")
    language = "eng"

This is optional since the current behavior is still correct.
python/src/server/services/search/hybrid_search_strategy.py (1)
143-163: Copy filter dict before mutation and align error logging in code example hybrid searchThe multi‑dimensional code example hybrid search wiring looks consistent with the documents path. Two small tweaks will improve safety and observability:
- Avoid mutating
filter_metadataSame pattern as above:
filter_json = filter_metadata or {}followed byfilter_json.pop("source")mutates the caller’s dict. Copying is cheap and avoids surprises:- filter_json = filter_metadata or {} + filter_json = dict(filter_metadata or {}) @@ - if not final_source_filter and "source" in filter_json: - final_source_filter = filter_json.pop("source") + if not final_source_filter and "source" in filter_json: + final_source_filter = filter_json.pop("source")(Here only the first line changes; the
popstill acts on a local copy.)
- Upgrade error logging to capture stack traces and context
In the
exceptat the end of this method, mirroring the pattern suggested forvector_search(usingexc_info=Trueand includingmatch_count/source_idorfinal_source_filterin the log message) would bring this in line with the stated logging guidelines and make Supabase/DB issues much easier to troubleshoot.Also applies to: 201-204
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
python/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
- python/pyproject.toml (1 hunks)
- python/src/server/services/search/base_search_strategy.py (2 hunks)
- python/src/server/services/search/hybrid_search_strategy.py (2 hunks)
- python/src/server/services/storage/base_storage_service.py (1 hunks)
- python/src/server/utils/document_processing.py (5 hunks)
- python/src/server/utils/ocr_processing.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
python/pyproject.toml
📄 CodeRabbit inference engine (AGENTS.md)
Use Python 3.12 for backend code
Files:
python/pyproject.toml
python/src/server/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/server/**/*.py: Fail fast and loud for service startup failures, missing configuration, database connection failures, authentication/authorization failures, data corruption, critical dependency unavailability, and invalid data that would corrupt state
Never accept corrupted data - skip failed items entirely rather than storing corrupted or incomplete data
Files:
python/src/server/utils/document_processing.pypython/src/server/utils/ocr_processing.pypython/src/server/services/search/hybrid_search_strategy.pypython/src/server/services/storage/base_storage_service.pypython/src/server/services/search/base_search_strategy.py
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Complete batch processing, background tasks, and external API calls but log detailed errors for each failure instead of crashing
Include context about what was being attempted, preserve full stack traces with exc_info=True in Python logging, use specific exception types, include relevant IDs/URLs, never return None to indicate failure, and report both success count and detailed failure list for batch operations
Use Ruff for linting and MyPy for type checking with 120 character line length in Python backend
Use Python 3.12 with 120 character line length in backend code
Import GITHUB_REPO_OWNER and GITHUB_REPO_NAME constants from python/src/server/config/version.py as the single source of truth for repository configuration
python/src/**/*.py: Python backend code should use 120 character line length
Use Ruff for linting Python code to check for errors, warnings, and unused imports
Use Mypy for type checking in Python to ensure type safety
Files:
python/src/server/utils/document_processing.pypython/src/server/utils/ocr_processing.pypython/src/server/services/search/hybrid_search_strategy.pypython/src/server/services/storage/base_storage_service.pypython/src/server/services/search/base_search_strategy.py
python/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Error messages must include context about what was being attempted, preserve full stack traces, use specific exception types, include relevant IDs/URLs, and report both success count and detailed failure list for batch operations
Files:
python/src/server/utils/document_processing.pypython/src/server/utils/ocr_processing.pypython/src/server/services/search/hybrid_search_strategy.pypython/src/server/services/storage/base_storage_service.pypython/src/server/services/search/base_search_strategy.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
python/src/server/services/**/*.py: For batch processing and background tasks, complete what you can and report detailed failures for each item rather than crashing
Never accept corrupted data - skip failed items entirely rather than storing corrupted data in batch operations
Files:
python/src/server/services/search/hybrid_search_strategy.pypython/src/server/services/storage/base_storage_service.pypython/src/server/services/search/base_search_strategy.py
🧬 Code graph analysis (1)
python/src/server/utils/document_processing.py (1)
python/src/server/utils/ocr_processing.py (2)
is_ocr_available (36-38), extract_text_with_ocr (41-124)
🔇 Additional comments (10)
python/src/server/services/storage/base_storage_service.py (3)
82-89: Heading boundary logic looks correct.Good addition of heading-based break points. The 30% threshold prevents excessively small chunks, and the code block awareness is properly implemented.
91-95: Paragraph break logic correctly enhanced with code-block awareness.The cascading check (
end == original_end) ensures proper priority ordering, and the code block guard prevents breaking mid-code.
97-101: Sentence break fallback correctly updated with code-block awareness.The priority cascading and code block guard are properly implemented. Including the period (
+ 1) in the current chunk is correct behavior.python/src/server/utils/document_processing.py (3)
13-18: LGTM on the availability check pattern.The try/except pattern for optional dependency detection is consistent with the existing patterns for PyPDF2, pdfplumber, and docx. This gracefully handles environments where pymupdf4llm isn't installed.
321-338: Good error handling with actionable messages.The OCR fallback path and error messages provide clear guidance for users, including installation instructions for missing dependencies. This aligns with the guideline to fail fast with descriptive errors when extraction completely fails.
47-83: LGTM on the code block preservation logic.The regex-based approach correctly handles code blocks split by page markers. Good use of
reversed()to maintain string positions during in-place replacements. The docstring clearly explains the problem being solved.python/src/server/utils/ocr_processing.py (2)
127-143: LGTM on language discovery logic.Good defensive implementation with a sensible fallback to
["eng"]when Tesseract's language list can't be retrieved. Filtering outosdis correct as it's a detection mode, not a language.
36-38: LGTM.Simple and effective availability check.
python/src/server/services/search/base_search_strategy.py (1)
48-76: Add guard for empty embeddings and enrich error logging context

The multi-dimensional wiring (computing `embedding_dimension`, using `f"{table_rpc}_multi"`, and logging the dimension) looks good, but two robustness gaps need attention per backend guidelines:

- Guard against empty/invalid `query_embedding` before deriving dimension

If `query_embedding` is `None` or empty, `len(query_embedding)` will fail downstream. Add an early validation check that logs and returns cleanly:

 with safe_span("base_vector_search", table=table_rpc, match_count=match_count) as span:
     try:
+        if not query_embedding:
+            logger.error("Vector search received empty query_embedding")
+            span.set_attribute("error", "empty_query_embedding")
+            return []
+
         # Detect embedding dimension from the query vector
         embedding_dimension = len(query_embedding)

- Include stack trace and structured context in error logs

Update the exception handler to include `exc_info=True` and contextual fields (`table_rpc`, `match_count`) as required by backend guidelines for production debuggability:

 except Exception as e:
-    logger.error(f"Vector search failed: {e}")
+    logger.error(
+        "Vector search failed for table_rpc=%s with match_count=%s",
+        table_rpc,
+        match_count,
+        exc_info=True,
+    )

python/src/server/services/search/hybrid_search_strategy.py (1)
54-67: Avoid mutating caller filters and guard/document embedding assumptions

The switch to `hybrid_search_archon_crawled_pages_multi` with an explicit `embedding_dimension` is solid. A couple of small improvements would make this more robust and side‑effect free:

- Do not mutate `filter_metadata` via `pop`

`filter_json = filter_metadata or {}` creates an alias; the subsequent `pop("source", ...)` mutates the original dict passed by the caller. That can be surprising if the same filter dict is reused elsewhere. You can cheaply avoid this by copying:

-    filter_json = filter_metadata or {}
+    filter_json = dict(filter_metadata or {})
     source_filter = filter_json.pop("source", None) if "source" in filter_json else None

- Explicitly handle empty/None `query_embedding` before computing length

Even if callers are expected to always supply a valid embedding, a lightweight guard will prevent obscure failures when that assumption is violated:

-    # Detect embedding dimension from the query vector
-    embedding_dimension = len(query_embedding)
+    if not query_embedding:
+        logger.error("Hybrid document search received empty query_embedding")
+        span.set_attribute("error", "empty_query_embedding")
+        return []
+
+    # Detect embedding dimension from the query vector
+    embedding_dimension = len(query_embedding)

- Align error logging with backend guidelines

As in `BaseSearchStrategy.vector_search`, consider switching the `except` block here to use `exc_info=True` and include relevant context (e.g., `match_count`, whether a `source_filter` was applied), so operational debugging is easier.

Also applies to: 108-111
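For illustration, a minimal sketch of the suggested logging style; the message text, context fields, and the stand-in failure are assumptions, not the strategy's actual handler:

```python
import logging

logger = logging.getLogger(__name__)

def run_search(match_count: int, source_filter: str | None) -> list:
    try:
        raise RuntimeError("supabase rpc failed")  # stand-in for the real RPC call
    except Exception:
        # exc_info=True captures the stack trace; the extra fields give
        # operators the query context they need for debugging.
        logger.error(
            "Hybrid search failed (match_count=%s, source_filter=%s)",
            match_count,
            source_filter,
            exc_info=True,
        )
        return []

run_search(match_count=10, source_filter=None)
```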
| "pymupdf4llm>=0.2.5", # Better PDF-to-Markdown extraction | ||
| "pytesseract>=0.3.10", # OCR for scanned/image-based PDFs | ||
| "pdf2image>=1.17.0", # PDF to image conversion for OCR |
New OCR dependencies missing from the all dependency group.
The new dependencies (pymupdf4llm, pytesseract, pdf2image) are added to the server group but not to the all group (lines 119-159). The all group is documented as combining "all container dependencies" for local unit testing. This omission could cause test failures when running tests locally that exercise the OCR code paths.
Add these dependencies to the all group around line 130:
"pypdf2>=3.0.1",
"pdfplumber>=0.11.6",
+ "pymupdf4llm>=0.2.5",
+ "pytesseract>=0.3.10",
+ "pdf2image>=1.17.0",
"python-docx>=1.1.2",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "pymupdf4llm>=0.2.5", # Better PDF-to-Markdown extraction | |
| "pytesseract>=0.3.10", # OCR for scanned/image-based PDFs | |
| "pdf2image>=1.17.0", # PDF to image conversion for OCR | |
| "pypdf2>=3.0.1", | |
| "pdfplumber>=0.11.6", | |
| "pymupdf4llm>=0.2.5", | |
| "pytesseract>=0.3.10", | |
| "pdf2image>=1.17.0", | |
| "python-docx>=1.1.2", |
🤖 Prompt for AI Agents
In python/pyproject.toml around lines 51-53 and the 'all' dependency group
region around lines 119-159 (add near line 130), the OCR packages `pymupdf4llm`,
`pytesseract`, and `pdf2image` were added to the `server` group but not included
in the `all` group; update the `all` extras list to include the same three
dependencies (with the same version constraints) so local testing pulls the OCR
deps—ensure formatting and trailing commas match existing list style.
import io
import tempfile
import os
from typing import Optional
🛠️ Refactor suggestion | 🟠 Major
Remove unused imports.
tempfile and os are imported but never used in this module. The PDF-to-image conversion and OCR operations don't require temp file handling here since convert_from_bytes works directly with byte content.
-import io
import tempfile
import os
from typing import Optional
+from typing import Optional

Actually, io also appears unused. Remove all three:
-import io
-import tempfile
-import os
from typing import Optional

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from typing import Optional
🤖 Prompt for AI Agents
In python/src/server/utils/ocr_processing.py around lines 13 to 16, the imports
tempfile, os, and io are unused; remove these unused imports and keep only the
necessary ones (for example typing.Optional and any others actually referenced
in the file) to eliminate dead code and satisfy the linter.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
archon-ui-main/src/services/credentialsService.ts (1)
62-75: Persisting Ollama auth tokens as plaintext credentials
`authToken` is now part of `OllamaInstance` and is persisted via `setOllamaInstances` with `is_encrypted: false`, meaning these per‑instance tokens will be stored in clear text like non‑secret config fields. For remote/authenticated Ollama hosts these tokens are effectively API credentials and should get the same at‑rest protections as other API keys.

Consider treating `authToken` as a secret when creating credentials, e.g. marking it encrypted (and aligning backend handling) while leaving non‑secret fields unchanged:

-    instances.forEach(instance => {
+    instances.forEach(instance => {
       const fields: Record<string, any> = {
         name: instance.name,
         baseUrl: instance.baseUrl,
         isEnabled: instance.isEnabled,
         isPrimary: instance.isPrimary,
         instanceType: instance.instanceType || 'both',
         loadBalancingWeight: instance.loadBalancingWeight || 100
       };
@@
-      if (instance.authToken) {
-        fields.authToken = instance.authToken;
-      }
+      if (instance.authToken) {
+        fields.authToken = instance.authToken;
+      }
@@
-      Object.entries(fields).forEach(([field, value]) => {
+      Object.entries(fields).forEach(([field, value]) => {
+        const isSecret = field === 'authToken';
         promises.push(
           this.createCredential({
             key: `ollama_instance_${instance.id}_${field}`,
             value: value.toString(),
-            is_encrypted: false,
+            is_encrypted: isSecret,
             category: 'ollama_instances'
           })
         );
       });

You may need a matching backend adjustment so `ollama_instances` credentials with `is_encrypted: true` are stored/decrypted consistently with `api_keys`.

Also applies to: 438-485, 496-551
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
18-20: Remove OpenRouter from embedding-capable providers
`EMBEDDING_CAPABLE_PROVIDERS` incorrectly includes `'openrouter'`. OpenRouter only proxies chat/completions endpoints and does not provide embedding models. Including it here misrepresents its capabilities to users and can cause confusion, as embeddings would actually come from the underlying routed provider (OpenAI, etc.), not OpenRouter itself. This was correctly excluded in the prior version.

-const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'openrouter', 'ollama'];
+const EMBEDDING_CAPABLE_PROVIDERS: ProviderKey[] = ['openai', 'google', 'ollama'];
♻️ Duplicate comments (3)
archon-ui-main/src/components/settings/RAGSettings.tsx (3)
142-175: Add missing Ollama auth/API-mode fields to RAGSettingsProps.ragSettings

The component reads and writes `ragSettings.OLLAMA_CHAT_AUTH_TOKEN`, `ragSettings.OLLAMA_EMBEDDING_AUTH_TOKEN`, and `ragSettings.OLLAMA_API_MODE` in multiple places (effects, modals, API-mode toggle), but these fields are not declared on the `ragSettings` type. Under strict TypeScript this will be a type error and prevents catching typos on these keys.

Extend the inline type to include the missing fields:

 interface RAGSettingsProps {
   ragSettings: {
     MODEL_CHOICE: string;
     CHAT_MODEL?: string;
@@
     EMBEDDING_PROVIDER?: string;
     OLLAMA_EMBEDDING_URL?: string;
     OLLAMA_EMBEDDING_INSTANCE_NAME?: string;
+    OLLAMA_CHAT_AUTH_TOKEN?: string;
+    OLLAMA_EMBEDDING_AUTH_TOKEN?: string;
+    OLLAMA_API_MODE?: 'native' | 'openai-compatible';
     // Crawling Performance Settings

After this, you can continue to access those properties directly without any `as any` workarounds.

As per coding guidelines.
571-584: Model loading behavior and default Ollama URL handling

The new `loadAvailableModels` helper and `ollamaMetrics` extensions are wired cleanly, and calling `loadAvailableModels` after a successful `manualTestConnection` is a nice touch to keep model lists in sync.

One lingering issue is the hardcoded default URL check in the auto‑load effects:

  useEffect(() => {
    if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
      loadAvailableModels(llmInstanceConfig.url, 'chat');
    }
  }, [llmInstanceConfig.url]);

  useEffect(() => {
    if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
      loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
    }
  }, [embeddingInstanceConfig.url]);

This still skips auto‑loading models for users who keep the default local Ollama URL, forcing them to rely on the page‑load tests or manual “Test Connection” to populate model lists.

You can remove the special‑case while still guarding against empty values:

-  useEffect(() => {
-    if (llmInstanceConfig.url && llmInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
-      loadAvailableModels(llmInstanceConfig.url, 'chat');
-    }
-  }, [llmInstanceConfig.url]);
+  useEffect(() => {
+    if (llmInstanceConfig.url && llmInstanceConfig.url.trim()) {
+      loadAvailableModels(llmInstanceConfig.url, 'chat');
+    }
+  }, [llmInstanceConfig.url]);
@@
-  useEffect(() => {
-    if (embeddingInstanceConfig.url && embeddingInstanceConfig.url !== 'http://host.docker.internal:11434/v1') {
-      loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
-    }
-  }, [embeddingInstanceConfig.url]);
+  useEffect(() => {
+    if (embeddingInstanceConfig.url && embeddingInstanceConfig.url.trim()) {
+      loadAvailableModels(embeddingInstanceConfig.url, 'embedding');
+    }
+  }, [embeddingInstanceConfig.url]);

This matches prior guidance and makes the behavior consistent for default vs custom URLs.
Also applies to: 641-692, 752-755, 891-909
2350-2366: Validate authentication tokens before saving Ollama instance settings

The new auth sections in the LLM and Embedding modals let users enable `useAuth` and enter an `authToken`, and the save handlers persist tokens into `ragSettings`:

LLM:
  OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : ''

Embedding:
  OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken : ''

However, if `useAuth` is enabled while the token field is empty or whitespace, the code still saves an empty token. That will cause authenticated calls to fail without immediate feedback.

Add a simple guard before constructing `updatedSettings` in both handlers:

 // LLM Save Changes
 <Button
   onClick={async () => {
+    if (llmInstanceConfig.useAuth && !llmInstanceConfig.authToken.trim()) {
+      showToast('Authentication token is required when authentication is enabled', 'error');
+      return;
+    }
     const updatedSettings = {
       ...ragSettings,
       LLM_BASE_URL: llmInstanceConfig.url,
       OLLAMA_CHAT_AUTH_TOKEN: llmInstanceConfig.useAuth ? llmInstanceConfig.authToken : ''
     };
@@
 // Embedding Save Changes
 <Button
   onClick={async () => {
+    if (embeddingInstanceConfig.useAuth && !embeddingInstanceConfig.authToken.trim()) {
+      showToast('Authentication token is required when authentication is enabled', 'error');
+      return;
+    }
     const updatedSettings = {
       ...ragSettings,
       OLLAMA_EMBEDDING_URL: embeddingInstanceConfig.url,
       OLLAMA_EMBEDDING_AUTH_TOKEN: embeddingInstanceConfig.useAuth ? embeddingInstanceConfig.authToken : ''
     };

This keeps the UX tight and avoids silently persisting unusable auth configurations.
Also applies to: 2368-2391, 2403-2411, 2443-2456, 2458-2481, 2493-2501
🧹 Nitpick comments (2)
archon-ui-main/src/services/credentialsService.ts (1)
13-45: RagSettings additions for CHAT_MODEL and OLLAMA_API_MODE look consistent

Defaults plus string-field mapping for `CHAT_MODEL` and `OLLAMA_API_MODE` are wired correctly into `getRagSettings`, so these values will survive round‑trips through credentials. Just ensure all consumers (e.g., `RAGSettings.tsx`’s props type) declare the same fields to keep TypeScript happy; I’ve flagged that separately in the UI file.

Also applies to: 191-281
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
225-272: Align auth token state, persistence, and summary display

The new refs/effects correctly sync `OLLAMA_*_AUTH_TOKEN` from `ragSettings` into `llmInstanceConfig` / `embeddingInstanceConfig`, but there’s a subtle mismatch with how the “Authentication” summary row is rendered:

- Effects set `useAuth: !!newAuthToken` but keep `authToken: newAuthToken || prev.authToken`.
- The summary row at lines 1852–1879 uses only `authToken` truthiness to decide between “Token configured” and “No authentication”.

That means if a user disables auth (`useAuth = false`) but the previous `authToken` string remains in state (or when `ragSettings` clears the token but local state still has `prev.authToken`), the summary can still show “Token configured” even though auth is off and no token will be sent.

You can make the UI match actual behavior by keying off `useAuth && authToken`:

-  {activeSelection === 'chat' ? (
-    llmInstanceConfig.authToken ? (
+  {activeSelection === 'chat' ? (
+    llmInstanceConfig.useAuth && llmInstanceConfig.authToken ? (
       <span className="text-teal-400 flex items-center">
@@
-    ) : (
-      <span className="text-gray-500 italic">No authentication</span>
-    )
-  ) : (
-    embeddingInstanceConfig.authToken ? (
+    ) : (
+      <span className="text-gray-500 italic">No authentication</span>
+    )
+  ) : (
+    embeddingInstanceConfig.useAuth && embeddingInstanceConfig.authToken ? (
       <span className="text-teal-400 flex items-center">
@@
-    ) : (
-      <span className="text-gray-500 italic">No authentication</span>
-    )
-  )}
+    ) : (
+      <span className="text-gray-500 italic">No authentication</span>
+    )
+  )}

Optionally, you could also drop the `|| prev.authToken` fallback in the effects if you want state to always mirror what’s actually persisted in `ragSettings`.

Also applies to: 1852-1879
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
archon-ui-main/src/components/settings/RAGSettings.tsx (20 hunks)
archon-ui-main/src/services/credentialsService.ts (6 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
archon-ui-main/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/**/*.{ts,tsx}: Use TypeScript in strict mode with no implicit any in frontend code
Do not use dynamic Tailwind class construction in frontend components
Files:
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/!(features)/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
Use ESLint with standard React rules for legacy code in the frontend
Files:
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
Use ESLint for legacy frontend code outside /src/features directory with standard React rules
Files:
archon-ui-main/src/services/credentialsService.ts
archon-ui-main/src/components/settings/RAGSettings.tsx
🧠 Learnings (3)
📚 Learning: 2025-11-29T19:24:36.223Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-29T19:24:36.223Z
Learning: Applies to archon-ui-main/src/**/*.{ts,tsx} : Use TypeScript in strict mode with no implicit any in frontend code
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-11-29T19:25:23.987Z
Learnt from: CR
Repo: coleam00/Archon PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-29T19:25:23.987Z
Learning: Applies to archon-ui-main/src/features/**/*.{ts,tsx} : TypeScript strict mode enabled with no implicit any, and use Biome with 120 character lines and double quotes in /src/features directory
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
📚 Learning: 2025-09-25T16:36:48.527Z
Learnt from: Chillbruhhh
Repo: coleam00/Archon PR: 736
File: archon-ui-main/src/components/settings/RAGSettings.tsx:17-19
Timestamp: 2025-09-25T16:36:48.527Z
Learning: OpenRouter does not provide embedding models - it only acts as a proxy/router for various LLM providers for chat completions, not embeddings. The EMBEDDING_CAPABLE_PROVIDERS array in RAGSettings.tsx correctly excludes 'openrouter'.
Applied to files:
archon-ui-main/src/components/settings/RAGSettings.tsx
🧬 Code graph analysis (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (2)
archon-ui-main/src/components/settings/utils/instanceConfigSync.ts (1)
syncEmbeddingFromLLM (18-28)

archon-ui-main/src/features/ui/primitives/input.tsx (1)
Input(8-29)
🔇 Additional comments (1)
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
1600-1659: Ollama API mode toggle wiring looks correct

The new “Ollama API Mode” selector cleanly toggles `ragSettings.OLLAMA_API_MODE` between `'native'` and `'openai-compatible'`, with sensible defaulting to native when the value is missing:

  (!ragSettings.OLLAMA_API_MODE || ragSettings.OLLAMA_API_MODE === 'native')

and

  onClick={() => setRagSettings({ ...ragSettings, OLLAMA_API_MODE: 'native' })}
  onClick={() => setRagSettings({ ...ragSettings, OLLAMA_API_MODE: 'openai-compatible' })}

Once the props type is updated to declare `OLLAMA_API_MODE`, this should integrate smoothly with the backend’s mode selection logic.
Summary
This PR adds comprehensive Ollama support to Archon, enabling users to use local LLM models for both chat and embeddings.
Features Added:
Technical Changes:
python/src/server/api_routes/ollama_api.py - Extended API endpoints
python/src/server/services/embeddings/embedding_service.py - Ollama embedding support
python/src/server/services/llm_provider_service.py - Ollama LLM provider
python/src/server/services/ollama/model_discovery_service.py - Model discovery
archon-ui-main/src/components/settings/OllamaConfigurationPanel.tsx - UI panel
archon-ui-main/src/components/settings/RAGSettings.tsx - Updated settings

Configuration:
Users can configure Ollama via the Settings page:
Test plan
archon-ui-main/tests/e2e/ollama-api-mode.spec.ts

🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Improvements
Tests
Chores
Style