feat: Add comprehensive OpenAI base URL configuration support #585

orestesgarcia wants to merge 18 commits into coleam00:main
Conversation
This commit implements comprehensive support for custom OpenAI base URL configuration throughout the entire Archon system, enabling users to route OpenAI API calls through proxies like LiteLLM, Azure OpenAI, or other OpenAI-compatible endpoints.

## Key Changes

### Backend Services
- Enhanced `credential_service.py` to support the OPENAI_BASE_URL setting
- Updated `llm_provider_service.py` to use a custom base_url for AsyncOpenAI clients
- Modified `code_storage_service.py` to support base_url for sync OpenAI clients

### PydanticAI Agent Integration
- Created new `agent_provider_config.py` for centralized provider configuration
- Updated `base_agent.py` to support async initialization with custom providers
- Modified `rag_agent.py` and `document_agent.py` to use configured providers
- Updated agent `server.py` to handle async provider initialization

### Frontend UI
- Enhanced `RAGSettings.tsx` with a conditional OpenAI base URL input field
- Added OPENAI_BASE_URL to the TypeScript interface
- Field appears only when OpenAI is selected as the provider

## Features
- Optional configuration (backwards compatible)
- Consistent behavior across all OpenAI usage points
- Secure encrypted storage in the database
- Comprehensive error handling and fallbacks
- Support for LiteLLM, Azure OpenAI, and other compatible endpoints

## Testing
- All Python files compile without syntax errors
- Linting issues addressed
- No breaking changes to existing functionality

Resolves: coleam00#584
Walkthrough

Adds optional OPENAI_BASE_URL across UI and backend, introduces centralized agent provider configuration, updates agents for async lazy initialization and runtime model resolution, and makes OpenAI clients (async/sync) honor a configured base URL and API key.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant UI as RAG Settings UI
    participant Svc as Credential Service
    participant Prov as Agent Provider Config
    participant Agent as Base/Doc/RAG Agent
    participant LLM as LLM Provider Service
    participant Store as Code Storage Service
    UI->>Svc: save rag_settings (OPENAI_BASE_URL)
    Note over Agent,Prov: Agent requests model resolution at runtime
    Agent->>Prov: get_configured_openai_model("gpt-x")
    Prov->>Svc: read OPENAI_BASE_URL / OPENAI_API_KEY (rag_strategy or env)
    alt base_url + api_key present
        Prov-->>Agent: OpenAIChatModel(model, provider=OpenAIProvider(base_url, api_key))
    else no base_url
        Prov-->>Agent: "openai:<model>"
    end
    Agent->>LLM: get_llm_client(provider=openai)
    LLM->>Svc: _get_provider_base_url(openai)
    alt base_url set
        LLM-->>Agent: AsyncOpenAI(api_key, base_url)
    else
        LLM-->>Agent: AsyncOpenAI(api_key)
    end
    Agent->>Store: build sync OpenAI client for embeddings/code ops
    Store->>Svc: read OPENAI_API_KEY / OPENAI_BASE_URL (sync)
    Store-->>Agent: OpenAI(api_key[, base_url])
```

```mermaid
sequenceDiagram
    participant Server as Agent Server
    participant Agent as Agent
    participant Stream as Stream CM
    Server->>Agent: run_stream(prompt, deps)
    Agent-->>Server: awaitable -> async context manager
    Server->>Stream: async with (awaitable result) as stream
    loop streaming
        Stream-->>Server: chunk/token
        Server-->>Client: SSE
    end
```

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
python/src/server/services/credential_service.py (1)
470-475: Use the uppercase "LLM_PROVIDER" key to avoid mismatches

Other code (reads, tests, credential listing) consistently uses "LLM_PROVIDER", so writing "llm_provider" here won't be picked up. Update the call in credential_service.py:

```diff
-        return await self.set_credential(
-            "llm_provider",
+        return await self.set_credential(
+            "LLM_PROVIDER",
             provider,
             category="rag_strategy",
             description=f"Active {service_type} provider",
```

python/src/agents/base_agent.py (4)
246-251: Preserve stack traces and avoid swallowing cancellations.

Add a `CancelledError` passthrough and log errors with `exc_info=True`.

```diff
 except asyncio.TimeoutError:
-    self.logger.error(f"Agent {self.name} timed out after 120 seconds")
+    self.logger.error(f"Agent {self.name} timed out after 120 seconds", exc_info=True)
     raise Exception(f"Agent {self.name} operation timed out - taking too long to respond")
-except Exception as e:
-    self.logger.error(f"Agent {self.name} failed: {str(e)}")
+except asyncio.CancelledError:
+    self.logger.warning(f"Agent {self.name} run was cancelled")
+    raise
+except Exception:
+    self.logger.error(f"Agent {self.name} failed", exc_info=True)
     raise
```
272-281: Guard tool registration against an uninitialized agent.

Lazy init makes this callable raise `AttributeError` if used before init. Fail fast with a clear message.

```diff
 def add_tool(self, func, **tool_kwargs):
     """
     Add a tool function to the agent.
@@
-    return self._agent.tool(**tool_kwargs)(func)
+    if self._agent is None:
+        raise RuntimeError(
+            "Agent not initialized. Register tools inside _create_agent() or initialize the agent before add_tool()."
+        )
+    return self._agent.tool(**tool_kwargs)(func)
```
282-290: Same guard for dynamic system prompts.

```diff
 def add_system_prompt_function(self, func):
@@
-    return self._agent.system_prompt(func)
+    if self._agent is None:
+        raise RuntimeError(
+            "Agent not initialized. Register system prompts inside _create_agent() or initialize the agent first."
+        )
+    return self._agent.system_prompt(func)
```
291-295: Avoid returning a possibly-None agent.

Either raise if uninitialized (fail fast) or annotate as `Agent | None`. Prefer raising for correctness.

```diff
 @property
 def agent(self) -> Agent:
     """Get the underlying PydanticAI agent instance."""
-    return self._agent
+    if self._agent is None:
+        raise RuntimeError("Agent not initialized yet")
+    return self._agent
```
🧹 Nitpick comments (8)
python/src/server/services/credential_service.py (1)
461-464: Normalize and coerce blank OPENAI_BASE_URL to None

Avoid persisting/propagating empty strings; trim and return None to signal "use default".

```diff
-elif provider == "openai":
-    # Allow custom OpenAI-compatible endpoint
-    return rag_settings.get("OPENAI_BASE_URL")
+elif provider == "openai":
+    # Allow custom OpenAI-compatible endpoint
+    return (str(rag_settings.get("OPENAI_BASE_URL") or "").strip() or None)
```

python/src/server/services/storage/code_storage_service.py (2)
569-575: Use the explicit OpenAI client import for consistency

Style-only; keeps usage in line with SDK docs.

```diff
-client = openai.OpenAI(**client_kwargs)
+from openai import OpenAI
+client = OpenAI(**client_kwargs)
```
588-599: Set a request timeout on the OpenAI call

Prevents indefinite hangs and aligns with "fail fast" guidance.

```diff
-response = client.chat.completions.create(
+response = client.chat.completions.create(
     model=model_choice,
     messages=[
         {
             "role": "system",
             "content": "You are a helpful assistant that analyzes code examples and provides JSON responses with example names and summaries.",
         },
         {"role": "user", "content": prompt},
     ],
     response_format={"type": "json_object"},
+    timeout=30,
 )
```

archon-ui-main/src/components/settings/RAGSettings.tsx (1)
89-102: Trim and avoid persisting empty OPENAI_BASE_URL

Store undefined when the input is blank; this prevents saving empty strings that later need normalization server-side.

```diff
-<Input
+<Input
   label="OpenAI Base URL (optional)"
   value={ragSettings.OPENAI_BASE_URL || ''}
-  onChange={e => setRagSettings({
-    ...ragSettings,
-    OPENAI_BASE_URL: e.target.value
-  })}
+  onChange={e => {
+    const value = e.target.value.trim();
+    setRagSettings({
+      ...ragSettings,
+      OPENAI_BASE_URL: value || undefined
+    });
+  }}
   placeholder="https://api.openai.com/v1"
   accentColor="green"
 />
```

python/src/server/services/llm_provider_service.py (1)
100-109: Harden OpenAI client instantiation: add timeouts/retries and validate base_url

```diff
--- a/python/src/server/services/llm_provider_service.py
+++ b/python/src/server/services/llm_provider_service.py
@@ Lines 100-109
-client_kwargs = {"api_key": api_key}
+client_kwargs = {"api_key": api_key, "timeout": 60.0, "max_retries": 3}
 if base_url:
-    client_kwargs["base_url"] = base_url
-    logger.info(f"OpenAI client created with custom base URL: {base_url}")
+    from urllib.parse import urlparse
+    if not urlparse(base_url).scheme:
+        raise ValueError("Invalid OPENAI_BASE_URL: missing scheme (expected http/https)")
+    client_kwargs["base_url"] = base_url
+    logger.info(f"OpenAI client created with custom base URL: {base_url}")
 else:
-    logger.info("OpenAI client created with default URL")
+    logger.debug("OpenAI client created with default URL")
 client = openai.AsyncOpenAI(**client_kwargs)
```

python/src/agents/agent_provider_config.py (1)
61-87: Validate the base URL format to fail fast on bad config

Avoids late failures deep in provider code.

```diff
 base_url = rag_settings.get("OPENAI_BASE_URL")
 if base_url:
-    logger.debug(f"Found OPENAI_BASE_URL in settings: {base_url}")
-    return base_url
+    from urllib.parse import urlparse
+    if not urlparse(base_url).scheme:
+        raise ValueError("Invalid OPENAI_BASE_URL in settings: missing scheme (expected http/https)")
+    logger.debug(f"Found OPENAI_BASE_URL in settings: {base_url}")
+    return base_url
```

python/src/agents/base_agent.py (2)
253-271: Streaming path: initialization LGTM; document the no-retry behavior.

Since streaming skips rate limiting/retries by design, add a brief doc note warning callers that 429s will surface immediately, so upstream should handle retry UX.
127-138: Avoid bare `except` in the wait-time parser.

Use `except Exception` and keep the behavior; meets linting guidelines.

```diff
-except:
-    pass
+except Exception:
+    # Ignore parse issues; fall back to exponential backoff
+    pass
```
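The fallback behavior this comment touches (use a server-suggested wait when one can be parsed, else exponential backoff) can be sketched as follows; the regex and defaults are illustrative assumptions, not the PR's actual parser:

```python
import re

def next_wait_seconds(error_message: str, attempt: int, base: float = 1.0) -> float:
    """Return a server-suggested wait parsed from the error text, or fall back
    to exponential backoff (base * 2**attempt) when no suggestion is found."""
    match = re.search(r"retry after (\d+(?:\.\d+)?)", error_message)
    if match:
        return float(match.group(1))
    return base * (2 ** attempt)
```

Keeping the `except Exception` narrow around the parse keeps a malformed error message from masking cancellation or interrupt signals.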
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (9)
- archon-ui-main/src/components/settings/RAGSettings.tsx (2 hunks)
- python/src/agents/agent_provider_config.py (1 hunks)
- python/src/agents/base_agent.py (4 hunks)
- python/src/agents/document_agent.py (1 hunks)
- python/src/agents/rag_agent.py (1 hunks)
- python/src/agents/server.py (1 hunks)
- python/src/server/services/credential_service.py (1 hunks)
- python/src/server/services/llm_provider_service.py (1 hunks)
- python/src/server/services/storage/code_storage_service.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Fail fast on service startup failures (crash with clear error if credentials, database, or any service cannot initialize)
Fail fast on missing configuration or invalid environment settings
Fail fast on database connection failures; do not hide connection issues
Fail fast on authentication/authorization failures; halt the operation and surface the error
Fail fast on data corruption or validation errors; let Pydantic raise
Fail fast when critical dependencies are unavailable (required service down)
Never store invalid data that would corrupt state (e.g., zero embeddings, null foreign keys, malformed JSON); fail instead
For batch processing, complete what you can and log detailed failures per item
Background tasks should finish queues but log failures clearly
Do not crash on a single WebSocket/event failure; log and continue serving other clients
If optional features are disabled, log and skip rather than crashing
External API calls should retry with exponential backoff; then fail with a clear, specific error
When continuing after a failure, skip the failed item entirely; never persist partial or corrupted results
Include context about the attempted operation in error messages
Preserve full stack traces with exc_info=True in Python logging
Use specific exception types; avoid catching generic Exception
Never return None to indicate failure; raise an exception with details
For batch operations, report both success counts and detailed failure lists
Target Python 3.12 and keep line length at 120 characters
Use Ruff for linting (errors, warnings, unused imports, style) and keep code Ruff-clean
Use Mypy for static type checking and keep code type-safe
Enable auto-formatting on save in IDEs to maintain consistent Python style
Files:
- python/src/agents/document_agent.py
- python/src/agents/server.py
- python/src/agents/rag_agent.py
- python/src/server/services/llm_provider_service.py
- python/src/server/services/credential_service.py
- python/src/agents/agent_provider_config.py
- python/src/server/services/storage/code_storage_service.py
- python/src/agents/base_agent.py
python/src/agents/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place PydanticAI agent implementations under python/src/agents/
Files:
- python/src/agents/document_agent.py
- python/src/agents/server.py
- python/src/agents/rag_agent.py
- python/src/agents/agent_provider_config.py
- python/src/agents/base_agent.py
archon-ui-main/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/**/*.{ts,tsx}: Never return null to indicate failure in the frontend; throw an Error with details instead
Use database task status values directly in the UI with no mapping: todo, doing, review, done
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/components/**
📄 CodeRabbit inference engine (CLAUDE.md)
Place reusable UI components under archon-ui-main/src/components/
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
archon-ui-main/src/{components,hooks,pages}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
archon-ui-main/src/{components,hooks,pages}/**/*.{ts,tsx}: State naming: use is[Action]ing for loading states (e.g., isSwitchingProject)
State naming: use [resource]Error for error messages
State naming: use selected[Resource] for current selections
Files:
archon-ui-main/src/components/settings/RAGSettings.tsx
python/src/server/**
📄 CodeRabbit inference engine (CLAUDE.md)
Keep the main FastAPI application under python/src/server/
Files:
- python/src/server/services/llm_provider_service.py
- python/src/server/services/credential_service.py
- python/src/server/services/storage/code_storage_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Put backend business logic services under python/src/server/services/
Files:
- python/src/server/services/llm_provider_service.py
- python/src/server/services/credential_service.py
- python/src/server/services/storage/code_storage_service.py
🧬 Code graph analysis (6)
python/src/agents/document_agent.py (2)
python/src/agents/base_agent.py (3)
- _create_agent (182-184)
- _get_configured_model (191-205)
- agent (292-294)

python/src/agents/rag_agent.py (1)

- _create_agent (71-317)
python/src/agents/server.py (1)
python/src/agents/base_agent.py (2)
- agent (292-294)
- run_stream (253-270)
python/src/agents/rag_agent.py (2)
python/src/agents/base_agent.py (3)
- _create_agent (182-184)
- _get_configured_model (191-205)
- agent (292-294)

python/src/agents/document_agent.py (1)

- _create_agent (73-663)
python/src/agents/agent_provider_config.py (1)
python/src/server/services/credential_service.py (3)
- get_credentials_by_category (277-323)
- get_credential (158-175)
- get_credential (485-487)
python/src/server/services/storage/code_storage_service.py (1)
python/src/server/services/credential_service.py (1)
_decrypt_value(110-122)
python/src/agents/base_agent.py (3)
python/src/agents/document_agent.py (1)
- _create_agent (73-663)

python/src/agents/rag_agent.py (1)

- _create_agent (71-317)

python/src/agents/agent_provider_config.py (1)

- get_configured_openai_model (17-58)
🔇 Additional comments (7)
archon-ui-main/src/components/settings/RAGSettings.tsx (1)
20-20: Type addition for OPENAI_BASE_URL looks good

Matches backend expectations and keeps the field optional.
python/src/agents/rag_agent.py (1)
71-79: LGTM: async model resolution integrated correctly

The agent now defers model selection via `_get_configured_model()`, aligning with the provider-config flow.

python/src/agents/server.py (1)
254-255: Correct streaming pattern

Awaiting `agent.run_stream(...)` to get the async context manager and then `async with`-ing it matches the updated BaseAgent contract.

python/src/agents/document_agent.py (1)
73-81: LGTM: async `_create_agent` with configured model

Switching to `configured_model` keeps the agent consistent with the new provider configuration path.

python/src/agents/base_agent.py (3)
159-167: Constructor param LGTM; the default preserves backward compatibility.

`use_custom_provider=True` is a good default and matches the PR objective. No issues.
235-237: Good: initialize before run.

This closes the race where `_agent` might be None at the first call. With the lock above, this is solid.
182-184: No non-async `_create_agent` overrides found; the breaking signature change is safe.
This commit addresses all review feedback from PR coleam00#585:

1. **Fail Fast on Missing API Key**:
   - Raise ValueError when OPENAI_BASE_URL is configured but the API key is missing
   - Prevents traffic leaking to public endpoints when a proxy is explicitly configured

2. **Thread-Safe Agent Initialization**:
   - Added asyncio.Lock to prevent race conditions in lazy initialization
   - Double-check pattern prevents double initialization in concurrent calls

3. **Improved Model Parsing**:
   - Use removeprefix() instead of replace() for safer model name extraction
   - Add validation for empty model names after prefix removal
   - Enhanced error logging with exc_info=True for full stack traces

4. **Better Credential Service Integration**:
   - Added environment variable fallback for OPENAI_BASE_URL
   - Improved error handling for credential retrieval
   - Added proper URL normalization (trim whitespace, handle empty strings)

5. **URL Validation and Security**:
   - Added comprehensive URL validation with proper scheme/hostname checks
   - Security warnings for non-HTTPS URLs in production
   - Fail fast on invalid URL configurations

## Security Improvements
- Prevents accidental traffic leakage to public APIs when a proxy is configured
- Validates URL formats to prevent configuration errors
- Warns about non-HTTPS usage in production environments

## Robustness Improvements
- Thread-safe initialization prevents race conditions
- Better error messages with full stack traces
- Graceful handling of credential retrieval failures
- Proper fallback chains for configuration sources

Addresses: coleam00#585 review feedback
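The double-check pattern from item 2 can be sketched in isolation (an illustrative class, not the PR's BaseAgent): the unlocked fast path skips the lock once initialized, and the re-check inside the lock guarantees a single initialization under concurrent calls:

```python
import asyncio

class LazyInit:
    """Minimal sketch of double-checked async lazy initialization."""

    def __init__(self):
        self._agent = None
        self._init_lock = asyncio.Lock()
        self.init_calls = 0  # instrumentation for the demo

    async def ensure_initialized(self):
        if self._agent is None:              # fast path: no lock once initialized
            async with self._init_lock:
                if self._agent is None:      # re-check after acquiring the lock
                    self.init_calls += 1
                    self._agent = object()   # stand-in for the real agent

async def demo() -> int:
    obj = LazyInit()
    # ten concurrent callers race to initialize the same instance
    await asyncio.gather(*(obj.ensure_initialized() for _ in range(10)))
    return obj.init_calls
```

Even with ten concurrent callers, initialization runs exactly once; note this guards coroutines on one event loop, not OS threads.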
Actionable comments posted: 5
♻️ Duplicate comments (1)
python/src/agents/base_agent.py (1)
187-194: Concurrency-safe lazy init implemented correctly.

Double-checked locking with asyncio.Lock avoids double initialization.
🧹 Nitpick comments (5)
python/src/agents/agent_provider_config.py (4)
34-36: Normalize the trailing slash.

Avoid double-slash issues when clients join paths. Apply:

```diff
-url = base_url.strip()
+url = base_url.strip().rstrip("/")
```
129-136: Sanitize the base URL in logs.

Avoid logging userinfo/query tokens if present. Apply:

```diff
-logger.debug(f"Found OPENAI_BASE_URL in settings: {base_url}")
+logger.debug("Found OPENAI_BASE_URL in settings: %s", _sanitize_url_for_logging(base_url))
@@
-logger.debug(f"Found OPENAI_BASE_URL in environment: {env_base_url}")
+logger.debug("Found OPENAI_BASE_URL in environment: %s", _sanitize_url_for_logging(env_base_url))
```

Add a helper (outside the shown range, e.g., below imports):

```python
def _sanitize_url_for_logging(url: str) -> str:
    try:
        p = urlparse(url.strip())
        host = p.hostname or ""
        if p.port:
            host += f":{p.port}"
        return f"{p.scheme}://{host}{p.path}"
    except Exception:
        return "<invalid-url>"
```
140-144: Use warning level and include the trace on settings fallback.

This is noteworthy behavior; keep the trace. Apply:

```diff
-except Exception as e:
-    logger.debug(f"Could not get OPENAI_BASE_URL from settings: {e}")
+except Exception:
+    logger.warning("Could not get OPENAI_BASE_URL from settings; falling back to env", exc_info=True)
```
166-170: Use warning level and include the trace on API key fallback.

Apply:

```diff
-except Exception as e:
-    logger.debug(f"Could not get OPENAI_API_KEY from settings: {e}")
+except Exception:
+    logger.warning("Could not get OPENAI_API_KEY from settings; falling back to env", exc_info=True)
```

python/src/agents/base_agent.py (1)
180-181: Fix the comment: it's concurrency-safe, not thread-safe.

asyncio.Lock guards coroutines, not threads. Apply:

```diff
-self._init_lock = asyncio.Lock()  # Thread-safe initialization lock
+self._init_lock = asyncio.Lock()  # Concurrency-safe (async) initialization lock
```
📒 Files selected for processing (3)
- python/src/agents/agent_provider_config.py (1 hunks)
- python/src/agents/base_agent.py (4 hunks)
- python/src/server/services/storage/code_storage_service.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- python/src/server/services/storage/code_storage_service.py
🧬 Code graph analysis (2)
python/src/agents/agent_provider_config.py (1)
python/src/server/services/credential_service.py (3)
- get_credentials_by_category (277-323)
- get_credential (158-175)
- get_credential (485-487)
python/src/agents/base_agent.py (3)
python/src/agents/rag_agent.py (1)
- _create_agent (71-317)

python/src/agents/document_agent.py (1)

- _create_agent (73-663)

python/src/agents/agent_provider_config.py (1)

- get_configured_openai_model (61-115)
🔇 Additional comments (2)
python/src/agents/base_agent.py (2)
247-249: Initialization before run: LGTM.

Ensures the agent exists before execution.
276-282: Streaming path init: LGTM.

Pre-initializing avoids races when acquiring the stream context.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Add comprehensive OpenAI base URL configuration section to configuration.mdx
- Include UI and environment variable configuration examples
- Document LiteLLM, Azure OpenAI, and corporate proxy setups in rag.mdx
- Add security considerations for proxy usage, including HTTPS requirements
- Update API reference with OPENAI_BASE_URL setting documentation
- Include practical examples for common proxy scenarios
- Add get_openai_client_config() function to agent_provider_config.py
  - Provides centralized OpenAI configuration for all openai.AsyncOpenAI clients
  - Honors OPENAI_BASE_URL settings with the same validation as PydanticAI agents
  - Maintains fail-fast behavior for security (requires API key when base URL is configured)
- Update llm_provider_service.py to use centralized configuration
  - Replace direct openai.AsyncOpenAI instantiation with the centralized config
  - Ensures all OpenAI API calls honor custom base URLs (LiteLLM, Azure OpenAI, etc.)
  - Maintain backward compatibility with existing provider-specific configuration
- Maintain consistent behavior across all OpenAI usage:
  - PydanticAI agents (via get_configured_openai_model)
  - LLM provider service (via get_openai_client_config)
  - Same security checks and validation logic applied everywhere

Fixes issue where llm_provider_service bypassed centralized OPENAI_BASE_URL configuration
- Clarify that OPENAI_BASE_URL applies system-wide to all services
- Emphasize consistent proxy routing across PydanticAI agents and the LLM service
- Update configuration.mdx to highlight centralized behavior
- Update rag.mdx to clarify that proxy settings apply to all OpenAI usage
- Update api-reference.mdx to indicate system-wide application
- Improve security documentation to cover all OpenAI API usage
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
python/src/server/services/llm_provider_service.py (1)
150-153: Include exc_info in provider creation errors.

Keep the full traceback for observability.

```diff
-logger.error(
-    f"Error creating LLM client for provider {provider_name if 'provider_name' in locals() else 'unknown'}: {e}"
-)
+logger.error(
+    "Error creating LLM client for provider %s",
+    provider_name if 'provider_name' in locals() else 'unknown',
+    exc_info=True,
+)
```

python/src/agents/base_agent.py (3)
92-95: Raise a specific error on rate-limit exhaustion.

Avoid generic Exception; surface a typed error.

```diff
-raise Exception(
-    f"Rate limit exceeded after {self.max_retries} retries: {full_error}"
-)
+raise RateLimitExceededError(
+    f"Rate limit exceeded after {self.max_retries} retries: {full_error}"
+)
```

Add once at top level (outside the diff context):

```python
class RateLimitExceededError(Exception):
    pass
```
265-271: Preserve stack traces and raise a typed error.

Log with exc_info and wrap in a domain-specific exception.

```diff
-except Exception as e:
-    self.logger.error(f"Agent {self.name} failed: {str(e)}")
-    raise
+except Exception as e:
+    self.logger.error("Agent %s failed", self.name, exc_info=True)
+    raise AgentExecutionError(f"Agent {self.name} failed") from e
```

Add once at top level (outside the diff context):

```python
class AgentExecutionError(Exception):
    pass
```
310-314: Make the agent property safe (None until initialized).

Either return Optional or raise; here we raise for clarity.

```diff
-def agent(self) -> Agent:
-    """Get the underlying PydanticAI agent instance."""
-    return self._agent
+def agent(self) -> Agent:
+    """Get the underlying PydanticAI agent instance."""
+    if self._agent is None:
+        raise RuntimeError("Agent not initialized; call await _ensure_agent_initialized() first")
+    return self._agent
```
🧹 Nitpick comments (8)
python/src/agents/agent_provider_config.py (1)
58-61: Broaden "local" detection for HTTP base URLs (reduce noisy warnings for RFC1918/private hosts).

Treat loopback and private subnets as "local" to avoid warning noise for corporate proxies.

```diff
+import ipaddress
@@
-if parsed.scheme == 'http' and not parsed.hostname.startswith(('localhost', '127.0.0.1', '0.0.0.0')):
-    logger.warning(f"Using non-HTTPS URL for OpenAI base URL: {url}. Consider using HTTPS for production.")
+is_private = False
+try:
+    host_ip = ipaddress.ip_address(parsed.hostname)
+    is_private = host_ip.is_loopback or host_ip.is_private
+except ValueError:
+    # Not an IP literal; accept common local hostnames
+    is_private = parsed.hostname in ('localhost', 'host.docker.internal')
+if parsed.scheme == 'http' and not is_private:
+    logger.warning(f"Using non-HTTPS URL for OpenAI base URL: {url}. Consider using HTTPS for production.")
```

Also applies to: 8-11
python/src/server/services/llm_provider_service.py (1)
63-76: Use the public `get_active_provider()` instead of private methods

Replace the direct calls to `_get_provider_api_key(provider)` and `_get_provider_base_url(provider, rag_settings)` with the public API:

```python
config = await credential_service.get_active_provider()
api_key = config["api_key"]
base_url = config["base_url"]
```

This avoids tying your code to internal implementations that may change.
python/src/agents/base_agent.py (2)
136-137: Avoid bare except.

Limit to Exception so as not to swallow BaseException (e.g., CancelledError, KeyboardInterrupt).

```diff
-except:
+except Exception:
     pass
```
291-300: Guard tool/system-prompt registration before initialization.

Prevent a None dereference; fail fast with a clear message.

```diff
 def add_tool(self, func, **tool_kwargs):
@@
-    return self._agent.tool(**tool_kwargs)(func)
+    if self._agent is None:
+        raise RuntimeError("Agent not initialized; call await _ensure_agent_initialized() before add_tool()")
+    return self._agent.tool(**tool_kwargs)(func)
@@
 def add_system_prompt_function(self, func):
@@
-    return self._agent.system_prompt(func)
+    if self._agent is None:
+        raise RuntimeError("Agent not initialized; call await _ensure_agent_initialized() before add_system_prompt_function()")
+    return self._agent.system_prompt(func)
```

docs/docs/configuration.mdx (2)
144-147: Minor casing consistency

Use sentence case to match nearby bullets.

```diff
-- **Custom Base URLs**: Configure custom endpoints for OpenAI-compatible proxies
+- **Custom base URLs**: Configure custom endpoints for OpenAI-compatible proxies
```
149-194: Document base-URL normalization to prevent a double or missing `/v1`

Without this, users can hit 404s (e.g., `.../v1/v1` or a missing `/v1`). Add a short note in the UI section.

```diff
 3. Enter your custom **OpenAI Base URL** (optional)
 4. Examples:
    - LiteLLM: `http://localhost:8000/v1`
    - Azure OpenAI: `https://your-resource.openai.azure.com/openai/deployments/your-deployment`
    - Custom proxy: `https://api.yourcompany.com/openai/v1`
+
+<Admonition type="tip" title="Base URL format">
+Include the API version path `/v1` exactly once in the base URL and avoid a trailing slash (use `.../v1`, not `.../v1/`).
+</Admonition>
```

Also, please confirm and document the precedence between UI settings and environment variables (which wins at runtime?) to avoid confusion.
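The normalization rule could also be enforced in code rather than left to documentation. A minimal standard-library sketch (`normalize_openai_base_url` is a hypothetical name; Azure deployment-style URLs would need to be exempted from the `/v1` rule):

```python
def normalize_openai_base_url(url: str) -> str:
    """Ensure the base URL ends with exactly one '/v1' and no trailing slash.

    Illustrative only: covers the LiteLLM/custom-proxy case, not
    Azure-style deployment URLs, which use a different path shape.
    """
    url = url.rstrip("/")
    if not url.endswith("/v1"):
        url += "/v1"
    return url
```

Applying this before storing the setting would prevent both the `.../v1/v1` and the missing-`/v1` misconfigurations mentioned above.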
docs/docs/rag.mdx (2)
306-312: Add trailing-slash guidance in the example

Prevents common misconfigurations.

```diff
 # Optional: Custom OpenAI endpoint
 OPENAI_BASE_URL=https://api.openai.com/v1
+
+# Tip: Include /v1 exactly once and avoid a trailing slash.
```
314-377: Azure note: `api-version` requirement

Most Azure OpenAI endpoints require an `api-version` query parameter. Add a reminder so users don't get 400s.

```diff
 OPENAI_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
 MODEL_CHOICE=gpt-4o-mini
 OPENAI_API_KEY=your-azure-api-key
+
+# Note: Azure OpenAI requires an `api-version` query parameter. Ensure your proxy or client adds it,
+# e.g., `...?api-version=2024-07-01-preview`.
```
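One standard-library way to ensure the `api-version` parameter is present when building raw Azure URLs. This is a hedged sketch: `with_api_version` is a hypothetical helper, and in practice the openai-python client can also inject the parameter via its `default_query` option:

```python
from urllib.parse import urlencode, urlparse, urlunparse


def with_api_version(url: str, api_version: str) -> str:
    """Append Azure's required api-version query parameter to a URL."""
    parsed = urlparse(url)
    extra = urlencode({"api-version": api_version})
    # Preserve any existing query string instead of clobbering it.
    query = f"{parsed.query}&{extra}" if parsed.query else extra
    return urlunparse(parsed._replace(query=query))
```

For example, `with_api_version("https://r.openai.azure.com/openai/deployments/d", "2024-07-01-preview")` yields a URL ending in `?api-version=2024-07-01-preview`.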
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
- docs/docs/api-reference.mdx (2 hunks)
- docs/docs/configuration.mdx (1 hunks)
- docs/docs/rag.mdx (2 hunks)
- python/src/agents/agent_provider_config.py (1 hunks)
- python/src/agents/base_agent.py (4 hunks)
- python/src/server/services/llm_provider_service.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
python/src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
python/src/**/*.py: Fail fast on service startup failures (crash with clear error if credentials, database, or any service cannot initialize)
Fail fast on missing configuration or invalid environment settings
Fail fast on database connection failures; do not hide connection issues
Fail fast on authentication/authorization failures; halt the operation and surface the error
Fail fast on data corruption or validation errors; let Pydantic raise
Fail fast when critical dependencies are unavailable (required service down)
Never store invalid data that would corrupt state (e.g., zero embeddings, null foreign keys, malformed JSON); fail instead
For batch processing, complete what you can and log detailed failures per item
Background tasks should finish queues but log failures clearly
Do not crash on a single WebSocket/event failure; log and continue serving other clients
If optional features are disabled, log and skip rather than crashing
External API calls should retry with exponential backoff; then fail with a clear, specific error
When continuing after a failure, skip the failed item entirely; never persist partial or corrupted results
Include context about the attempted operation in error messages
Preserve full stack traces with exc_info=True in Python logging
Use specific exception types; avoid catching generic Exception
Never return None to indicate failure; raise an exception with details
For batch operations, report both success counts and detailed failure lists
Target Python 3.12 and keep line length at 120 characters
Use Ruff for linting (errors, warnings, unused imports, style) and keep code Ruff-clean
Use Mypy for static type checking and keep code type-safe
Enable auto-formatting on save in IDEs to maintain consistent Python style
Files:
- python/src/server/services/llm_provider_service.py
- python/src/agents/agent_provider_config.py
- python/src/agents/base_agent.py
python/src/server/**
📄 CodeRabbit inference engine (CLAUDE.md)
Keep the main FastAPI application under python/src/server/
Files:
python/src/server/services/llm_provider_service.py
python/src/server/services/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Put backend business logic services under python/src/server/services/
Files:
python/src/server/services/llm_provider_service.py
python/src/agents/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Place PydanticAI agent implementations under python/src/agents/
Files:
- python/src/agents/agent_provider_config.py
- python/src/agents/base_agent.py
🧬 Code graph analysis (3)
python/src/server/services/llm_provider_service.py (1)
python/src/agents/agent_provider_config.py (1)
- `get_openai_client_config` (178-239)
python/src/agents/agent_provider_config.py (1)
python/src/server/services/credential_service.py (3)
- `get_credentials_by_category` (277-323)
- `get_credential` (158-175)
- `get_credential` (485-487)
python/src/agents/base_agent.py (3)
python/src/agents/rag_agent.py (1)
- `_create_agent` (71-317)

python/src/agents/document_agent.py (1)
- `_create_agent` (73-663)

python/src/agents/agent_provider_config.py (1)
- `get_configured_openai_model` (65-122)
🔇 Additional comments (1)
docs/docs/api-reference.mdx (1)
1469-1472: `OPENAI_BASE_URL` example looks good

Example value includes the expected `/v1` suffix and is consistent with the new setting.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- Add exc_info=True to error logging for better observability - Use proper logging format strings instead of f-strings - Remove unused exception variable after switching to exc_info - Provides full stack trace for debugging provider creation failures
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/docs/rag.mdx (1)
439-445: Nice: explicit "No Secrets in Logs" guidance added

This closes the prior security doc gap.
🧹 Nitpick comments (2)
docs/docs/api-reference.mdx (1)
1479-1483: Clarify URL format and precedence to avoid misconfiguration

Specify that `OPENAI_BASE_URL` must include `/v1` and that it overrides `LLM_BASE_URL` when `LLM_PROVIDER=openai`.
Apply this diff:
Apply this diff:

```diff
-| `LLM_BASE_URL` | string | Custom base URL (required for Ollama) |
-| `OPENAI_BASE_URL` | string | Custom OpenAI endpoint for proxies (LiteLLM, Azure OpenAI) — applies system-wide |
+| `LLM_BASE_URL` | string | Custom base URL (required for Ollama). Ignored when `LLM_PROVIDER=openai` and `OPENAI_BASE_URL` is set. |
+| `OPENAI_BASE_URL` | string | Custom OpenAI-compatible base URL (include `/v1`, avoid trailing slash) — applies system-wide and takes precedence over `LLM_BASE_URL` when `LLM_PROVIDER=openai`. |
```

docs/docs/rag.mdx (1)
306-312: Example should illustrate a non-default endpoint

Show a proxy URL to reinforce "optional custom endpoint," and note the default.
Apply this diff:
```diff
-# Optional: Custom OpenAI endpoint
-OPENAI_BASE_URL=https://api.openai.com/v1
+# Optional: Custom OpenAI endpoint
+# Default is https://api.openai.com/v1 — set this only when routing via a proxy
+OPENAI_BASE_URL=http://localhost:8000/v1
```
📒 Files selected for processing (4)
- docs/docs/api-reference.mdx (2 hunks)
- docs/docs/rag.mdx (2 hunks)
- python/src/agents/agent_provider_config.py (1 hunks)
- python/src/server/services/llm_provider_service.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- python/src/server/services/llm_provider_service.py
- python/src/agents/agent_provider_config.py
🔇 Additional comments (1)
docs/docs/api-reference.mdx (1)
1469-1473: LGTM: example now surfaces `OPENAI_BASE_URL` in RAG settings payload

Matches backend/UI behavior; no issues spotted.
- Add custom exception classes: RateLimitExceededError and AgentExecutionError - Update rate limit error handling to use specific exception type - Improve agent execution error logging with exc_info for better observability - Make agent property safe with proper None checking and clear error message - Use proper exception chaining and avoid bare except clauses - Enhance debugging capabilities with better error context and stack traces
- Add ipaddress import for robust IP address validation - Expand private host detection to include RFC1918 private networks: - IPv4: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 - IPv6: loopback (::1) and unique local addresses (fc00::/7) - Include Docker-specific hostnames (host.docker.internal) - Reduce noisy warnings for legitimate corporate proxy deployments - Use ipaddress.is_loopback and is_private for accurate detection - Maintain security warnings for genuinely public HTTP endpoints This prevents false warnings for internal development environments, Docker deployments, and corporate networks while still alerting for potentially insecure public HTTP endpoints.
- Add environment variable configuration for OpenAI client behavior: - OPENAI_TIMEOUT_SECONDS (default: 60) - Request timeout in seconds - OPENAI_MAX_RETRIES (default: 5) - Maximum retry attempts - Apply consistent timeout and retry configuration across all providers: - OpenAI (including custom base URLs) - Ollama (OpenAI-compatible client) - Google Gemini (OpenAI-compatible client) - Improve reliability and control over API behavior - Allow environment-specific tuning for production vs development - Aligns with backoff guideline by setting defaults at client construction - Downstream calls inherit retry/timeout behavior automatically
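The environment-driven defaults in this commit can be read with a small helper before constructing a client. A sketch assuming the env var names above and the `timeout`/`max_retries` keyword arguments of the openai-python client (`openai_client_kwargs` is a hypothetical name):

```python
import os


def openai_client_kwargs() -> dict:
    """Build timeout/retry kwargs for an (Async)OpenAI client from the environment.

    Defaults mirror the commit message: 60s timeout, 5 retries.
    """
    return {
        "timeout": float(os.environ.get("OPENAI_TIMEOUT_SECONDS", "60")),
        "max_retries": int(os.environ.get("OPENAI_MAX_RETRIES", "5")),
    }
```

A caller might then do `AsyncOpenAI(api_key=key, base_url=base_url, **openai_client_kwargs())`, so every provider path (OpenAI, Ollama, Gemini via OpenAI-compatible clients) inherits the same behavior.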
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Thanks for this @orestesgarcia! This is one of the high priority items for us. We have an Ollama integration coming in from @tazmon95 soon and will touch some of the same things, so I think the plan is to bring that in and then start bringing in some other crawling/settings/provider PRs like this awesome one!
@orestesgarcia ollama integration and some other big architecture changes are now in place, so this PR needs some resolving/rebasing.
Production merge including: - PR coleam00#583: Official Supabase integration (13-service stack) - PR coleam00#584: Audit v2 findings (healthchecks, dependencies, error handling) - PR coleam00#585: Hybrid NetworkPolicy (explicit external API allow-list) - GPU Orchestrator security hardening (no-new-privileges, cap_drop) - CHIT security documentation - Service dependencies with healthchecks - Agent Zero healthcheck 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> # Conflicts: # docs/PMOVES.AI-Edition-Hardened-Full.md # pmoves/Makefile # pmoves/__init__.py # pmoves/docker-compose.external.yml # pmoves/docker-compose.yml # pmoves/env.shared.example # pmoves/env.tier-media
* Fix: isolation resolver swallows errors and leaks partial state (#585) Programming bugs in provider.create() were silently converted to user-facing "workspace blocked" messages, making them invisible. A successful worktree creation followed by a failed DB insert would also orphan the worktree on disk with no cleanup. Changes: - Add isKnownIsolationError() to errors.ts to distinguish known infrastructure errors (permission denied, timeout, no space, not-a-git-repo) from unknown errors (programming bugs, unexpected failures) - Split createNewEnvironment try/catch: unknown errors now propagate as crashes instead of silently becoming 'blocked'; known errors still return 'blocked' - Add orphan cleanup in createNewEnvironment: if store.create() fails after worktree creation, destroy() is called on the orphaned worktree before rethrowing the store error - Add errorType field to markDestroyedBestEffort error log per codebase logging conventions - Change copyConfiguredFiles to return { configLoadFailed } flag instead of silently swallowing config load errors; propagate warning through createWorktree and create() into IsolationResolution.warnings - Add warnings?: string[] to IsolatedEnvironmentBase and IsolationResolution resolved variant to surface non-fatal issues to callers Fixes #585 * fix: address review findings from PR #597 - Add 'branch not found' entry to classifyIsolationError to close gap with isKnownIsolationError - Surface IsolationResolution.warnings to user via platform.sendMessage in orchestrator resolved case - Add tests for warnings propagation through resolver (with and without warnings) - Remove unnecessary type cast in destroy mock — options?.force is already typed correctly - Add errorType to copyConfiguredFiles catch block for log query consistency - Update copyConfiguredFiles JSDoc to accurately describe when configLoadFailed is set - Update stale inline comment to mention configLoadFailed flag
Pull Request
Summary
This PR adds comprehensive support for custom OpenAI base URL configuration throughout the entire Archon system. This enables users to route OpenAI API calls through proxies like LiteLLM, Azure OpenAI, or other OpenAI-compatible endpoints, providing much-needed flexibility for enterprise deployments and cost optimization.
Key motivation: Enable access to OpenAI API through proxy services like LiteLLM for better cost management, rate limiting, and provider switching capabilities.
Changes Made
Backend Services
- `credential_service.py`: Added `OPENAI_BASE_URL` support in the `_get_provider_base_url()` method
- `llm_provider_service.py`: Modified OpenAI client creation to accept an optional `base_url` parameter for AsyncOpenAI clients
- `code_storage_service.py`: Updated the synchronous OpenAI client to support a custom `base_url` for consistency

PydanticAI Agent Integration
- `agent_provider_config.py`: New centralized module for configuring PydanticAI OpenAI providers with custom endpoints
- `base_agent.py`: Added async initialization support and a `_get_configured_model()` method for custom provider configuration
- `rag_agent.py` and `document_agent.py`: Modified to use custom OpenAI providers when a base URL is configured
- `server.py`: Modified the streaming endpoint to handle async agent initialization

Frontend UI
- `RAGSettings.tsx`: Added a conditional OpenAI Base URL input field that appears only when OpenAI is selected as the provider
- Added an `OPENAI_BASE_URL?: string` property to `RAGSettingsProps`

Type of Change
Affected Services
Testing
Test Evidence
Checklist
Breaking Changes
None - This is a fully backwards-compatible change. Existing deployments will continue to work exactly as before if no `OPENAI_BASE_URL` is configured.

Additional Notes
Configuration Examples
LiteLLM Proxy:
```
# In Settings UI or environment
OPENAI_BASE_URL=http://localhost:4000/v1
```

Azure OpenAI:
Corporate Proxy:
Implementation Highlights
Consistent Coverage: All OpenAI usage points now support custom base URLs:
Smart Fallbacks: When custom base URL is configured but fails, the system gracefully falls back to default OpenAI endpoints with appropriate logging
UI Integration: The base URL field only appears when OpenAI is selected as the provider, maintaining a clean user interface
Security: Custom base URLs are stored encrypted in the database alongside other credentials
Files Changed (9 total)
- `python/src/server/services/credential_service.py`
- `python/src/server/services/llm_provider_service.py`
- `python/src/server/services/storage/code_storage_service.py`
- `python/src/agents/agent_provider_config.py` (new file)
- `python/src/agents/base_agent.py`
- `python/src/agents/rag_agent.py`
- `python/src/agents/document_agent.py`
- `python/src/agents/server.py`
- `archon-ui-main/src/components/settings/RAGSettings.tsx`

Resolves: #584
Summary by CodeRabbit
New Features
Documentation