📝 Walkthrough

Adds the deployment URL to the docs, switches HuggingFace embeddings to the Inference API and adds an API token setting, adds timeout, chunking, and client-disconnect handling plus a fallback to streaming, hardens the prompt for security, and rearranges imports with the related test mock paths updated.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Client as Client
participant Route as ChatRoute
participant LLM as LLM Service
participant Emb as Embedding Service
participant DB as VectorStore/DB
Client->>Route: POST /chat (stream)
Note right of Route: Execution starts after local import
Route->>Emb: agenerate_embedding(query) (await)
Emb-->>Route: embedding
Route->>DB: similarity search (await)
DB-->>Route: docs
Route->>LLM: get_response_stream_async(prompt, docs) (stream)
alt stream chunks
LLM-->>Route: chunk(n)
Route->>Client: SSE/stream chunk(n)
else zero chunks
Route->>Client: fallback content message
end
Note over Route,Client: Streaming is aborted when a client disconnect is detected
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/app/services/llm.py (1)
126-135: Ensure the async generator is cleaned up in a finally block.

In the code at lines 126-135, the generator is not explicitly closed when a TimeoutError is raised, which can leak the stream resource. Add a finally block so that await generator.aclose() is always called, and use a logger instead of print() to match the project's logging pattern.

♻️ Suggested patch
 async def get_response_stream_async(context: str, query: str, history: str = ""):
     """
     Returns an async stream of strings from the LLM.
     """
     prompt = get_rag_prompt()
     chain = prompt | get_llm() | StrOutputParser()
     generator = chain.astream({"context": context, "chat_history": history, "query": query})
+    try:
-    while True:
+        while True:
-        try:
+            try:
-            chunk = await asyncio.wait_for(generator.__anext__(), timeout=30.0)
+                chunk = await asyncio.wait_for(generator.__anext__(), timeout=30.0)
-            yield chunk
+                yield chunk
-        except StopAsyncIteration:
+            except StopAsyncIteration:
-            break
+                break
-        except asyncio.TimeoutError:
+            except asyncio.TimeoutError:
-            print("LLM stream chunk timed out after 30 seconds.")
+                logger.warning("LLM stream chunk timed out after 30 seconds.")
-            raise
+                raise
+    finally:
+        await generator.aclose()

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm.py` around lines 126 - 135, Ensure the async generator returned by chain.astream is always closed: wrap the loop that awaits generator.__anext__ via asyncio.wait_for in try/except/finally and in the finally call await generator.aclose() to guarantee cleanup on TimeoutError or other exits; also replace the print("LLM stream chunk timed out after 30 seconds.") with the project logger (e.g., logger.error or similar) to follow logging conventions and include contextual information (context, query, etc.) when logging the timeout.
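The suggested pattern can be exercised in isolation with a stand-in generator. The sketch below is illustrative only: `fake_stream` and `consume` are hypothetical names standing in for `chain.astream` and the service's streaming loop, not code from the repository.

```python
import asyncio


async def fake_stream():
    # Stand-in for the LLM chain's astream() async generator.
    for chunk in ["Hello", ", ", "world"]:
        yield chunk


async def consume(generator, timeout: float = 30.0):
    """Pull chunks with a per-chunk timeout, always closing the generator."""
    chunks = []
    try:
        while True:
            try:
                chunk = await asyncio.wait_for(generator.__anext__(), timeout=timeout)
                chunks.append(chunk)
            except StopAsyncIteration:
                break
            except asyncio.TimeoutError:
                # In the real service this would be logger.warning(...)
                raise
    finally:
        # Guarantees cleanup even on timeout or cancellation; a no-op
        # if the generator is already exhausted.
        await generator.aclose()
    return chunks


print(asyncio.run(consume(fake_stream())))  # ['Hello', ', ', 'world']
```

Note that `StopAsyncIteration`, unlike `StopIteration`, propagates normally through the task that `asyncio.wait_for` creates, which is what makes this loop shape work.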
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 130-143: The current loop breaks on client disconnect
(request.is_disconnected()) but still falls through to the post-loop fallback
that yields when chunk_count == 0; change the behavior so a client disconnect
short-circuits the generator: either replace the break with an immediate return
from the generator or set a flag (e.g., disconnected = True) when
request.is_disconnected() is true and check that flag before executing the
chunk_count == 0 fallback yield; update get_response_stream_async handling
accordingly so no fallback yield is sent after a disconnect.
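The flag-based variant of that fix can be sketched as follows. `FakeRequest`, `stream_response`, and the fallback string are illustrative stand-ins (FastAPI's real Request exposes the same awaitable `is_disconnected()` method); this is not the repository's code.

```python
import asyncio


class FakeRequest:
    """Minimal stand-in for fastapi.Request's is_disconnected()."""

    def __init__(self, disconnected: bool = False):
        self._disconnected = disconnected

    async def is_disconnected(self) -> bool:
        return self._disconnected


async def stream_response(request, llm_chunks):
    chunk_count = 0
    disconnected = False
    async for chunk in llm_chunks:
        if await request.is_disconnected():
            # Client went away: stop streaming and remember why we stopped.
            disconnected = True
            break
        chunk_count += 1
        yield chunk
    # Only emit the zero-chunk fallback if the client is still listening.
    if chunk_count == 0 and not disconnected:
        yield "Sorry, I could not generate a response."


async def demo():
    async def empty():
        return
        yield  # makes this an async generator

    async def chunks():
        yield "partial answer"

    # Connected client with an empty stream receives the fallback.
    print([c async for c in stream_response(FakeRequest(False), empty())])
    # Disconnected client receives nothing, including no fallback.
    print([c async for c in stream_response(FakeRequest(True), chunks())])


asyncio.run(demo())
```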
In `@backend/app/main.py`:
- Around line 52-53: The except block binds an unused exception variable `e`;
change the exception handler from `except Exception as e:` to `except
Exception:` in the shutdown/preload wait code so the unused-variable warning
(F841) is removed while keeping the existing `logger.exception("Exception
occurred while waiting for preload task during shutdown.")` call unchanged
(locate the handler where `logger.exception` is invoked in main.py).
---
Nitpick comments:
In `@backend/app/services/llm.py`:
- Around line 126-135: Ensure the async generator returned by chain.astream is
always closed: wrap the loop that awaits generator.__anext__ via
asyncio.wait_for in try/except/finally and in the finally call await
generator.aclose() to guarantee cleanup on TimeoutError or other exits; also
replace the print("LLM stream chunk timed out after 30 seconds.") with the
project logger (e.g., logger.error or similar) to follow logging conventions and
include contextual information (context, query, etc.) when logging the timeout.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4db08edd-431e-44c0-85b4-35eb91bcb224
📒 Files selected for processing (10)
README.md
backend/app/api/routes/chat.py
backend/app/core/config.py
backend/app/main.py
backend/app/services/embedding.py
backend/app/services/llm.py
backend/requirements.txt
backend/tests/e2e/test_chat_endpoint.py
backend/tests/integration/test_supabase_match.py
frontend/components/chat/ChatMain.tsx
Actionable comments posted: 2
🧹 Nitpick comments (1)
backend/app/main.py (1)
9-15: Clean up the duplicate asyncio import.

Line [9] and Line [15] import the same module twice. Keep only one of the two import statements.
Suggested patch
 from app.api.routes import chat
 from app.core.rate_limit import limiter
 import asyncio
 from contextlib import asynccontextmanager
 import logging

 logger = logging.getLogger(__name__)
-
-import asyncio

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/main.py` around lines 9 - 15, the asyncio module is imported twice at the top of the file; remove one of the duplicate import statements so that only a single import asyncio remains. The identifiers involved are the module name asyncio and the file's top import block (where import asyncio is currently declared twice).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/main.py`:
- Around line 94-96: The readiness_check function currently uses a broad except
Exception as e: which triggers BLE001; to make the intent explicit leave the
broad catch but annotate it with the lint-ignore and rationale by changing the
except to include a trailing comment (e.g., add " # noqa: BLE001 - Service
boundary layer catches all preload failures") next to the except Exception as
e:, keeping the logger.warning("Preload task failed during readiness check: %s",
e) and the JSONResponse({"status": "failed"}, status_code=503) behavior
unchanged so the service boundary continues to convert any preload
initialization error into a 503 while suppressing the BLE001 warning.
In `@backend/app/services/llm.py`:
- Line 137: The warning currently logs the raw user input via
logger.warning(f"... Query: {query}"); change it to avoid storing PII by logging
only non-identifying info (e.g., a short hash or metadata). Replace the f-string
that references query with a redacted identifier: compute a short SHA-256 (or
similar) of query (using hashlib.sha256(query.encode('utf-8')).hexdigest()[:8])
or log query length/placeholder, and include that hash/id in the logger.warning
call instead of the raw query; ensure you add the hashlib import if you choose
hashing and keep references to the same logger.warning and query variable names.
---
Nitpick comments:
In `@backend/app/main.py`:
- Around line 9-15: The asyncio module is imported twice at the top of the file;
remove one of the duplicate import statements so that only a single import
asyncio remains. The identifiers involved are the module name asyncio and the
file's top import block (where import asyncio is currently declared twice).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 54e9c16e-f904-4359-b54f-fc238f8cc7d8
📒 Files selected for processing (3)
backend/app/api/routes/chat.py
backend/app/main.py
backend/app/services/llm.py
except Exception as e:
    logger.warning("Preload task failed during readiness check: %s", e)
    return JSONResponse({"status": "failed"}, status_code=503)
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Locate the preload/readiness exception handling in main.py
rg -n -C3 'def preload_models|def readiness_check|preload_task.result|except Exception as e|except Exception:' backend/app/main.py
# 2) Confirm the files targeted by the preload chain
fd 'embedding.py|llm.py' backend/app/services
# 3) Check explicit raise/except types in the related services
rg -n -C2 'raise\s+' backend/app/services/ 2>/dev/null | head -100

Repository: SanghunYun95/philo-rag
Length of output: 2990
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check embedding service - embeddings property
rg -n -A20 '@property' backend/app/services/embedding.py | grep -A20 'embeddings'
# Check get_llm function implementation
rg -n -A15 'def get_llm' backend/app/services/llm.py
# Check imports and other potential exceptions
head -50 backend/app/services/embedding.py
head -50 backend/app/services/llm.py

Repository: SanghunYun95/philo-rag
Length of output: 5774
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if there are try-catch blocks or exception handling patterns in the service files
rg -n 'except|raise' backend/app/services/embedding.py backend/app/services/llm.py
# Check the full get_llm implementation to see if other exceptions could occur
tail -20 backend/app/services/llm.py
# Check preload_models implementation again to understand full context
sed -n '19,45p' backend/app/main.py

Repository: SanghunYun95/philo-rag
Length of output: 2005
Make the broad exception handling in readiness_check explicit.

The except Exception as e: on Line 94 triggers a BLE001 warning. This is an intentional pattern: the readiness-check endpoint at the service boundary catches every failure of the preload task and returns a 503. Since initializing the HuggingFace and Google Generative AI libraries can raise a variety of exceptions beyond RuntimeError/ValueError, rather than narrowing the exception type it is recommended to make the intent explicit with a # noqa: BLE001 annotation and rationale:

except Exception as e:  # noqa: BLE001 - Service boundary layer catches all preload failures

🧰 Tools
🪛 Ruff (0.15.2)
[warning] 94-94: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/main.py` around lines 94 - 96, The readiness_check function
currently uses a broad except Exception as e: which triggers BLE001; to make the
intent explicit leave the broad catch but annotate it with the lint-ignore and
rationale by changing the except to include a trailing comment (e.g., add " #
noqa: BLE001 - Service boundary layer catches all preload failures") next to the
except Exception as e:, keeping the logger.warning("Preload task failed during
readiness check: %s", e) and the JSONResponse({"status": "failed"},
status_code=503) behavior unchanged so the service boundary continues to convert
any preload initialization error into a 503 while suppressing the BLE001
warning.
except asyncio.TimeoutError:
    import logging
    logger = logging.getLogger(__name__)
    logger.warning(f"LLM stream chunk timed out after 30 seconds. Query: {query}")
Do not log the raw user query on timeout.

Line 137 records the user's input verbatim, so sensitive information can end up stored in production logs. Log only non-identifying information instead of the raw text.

🔧 Suggested fix
- logger.warning(f"LLM stream chunk timed out after 30 seconds. Query: {query}")
+ logger.warning("LLM stream chunk timed out after 30 seconds")

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
logger.warning("LLM stream chunk timed out after 30 seconds")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/services/llm.py` at line 137, The warning currently logs the raw
user input via logger.warning(f"... Query: {query}"); change it to avoid storing
PII by logging only non-identifying info (e.g., a short hash or metadata).
Replace the f-string that references query with a redacted identifier: compute a
short SHA-256 (or similar) of query (using
hashlib.sha256(query.encode('utf-8')).hexdigest()[:8]) or log query
length/placeholder, and include that hash/id in the logger.warning call instead
of the raw query; ensure you add the hashlib import if you choose hashing and
keep references to the same logger.warning and query variable names.
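The redaction described above can be sketched as follows. `query_fingerprint` is a hypothetical helper name and the sample query is made up; only the hashing recipe (a short SHA-256 prefix) comes from the review comment.

```python
import hashlib
import logging

logger = logging.getLogger(__name__)


def query_fingerprint(query: str) -> str:
    # First 8 hex chars of SHA-256: enough to correlate repeated failures
    # in logs without storing the user's text.
    return hashlib.sha256(query.encode("utf-8")).hexdigest()[:8]


query = "what does Kant say about duty?"
logger.warning(
    "LLM stream chunk timed out after 30 seconds. query_hash=%s query_len=%d",
    query_fingerprint(query),
    len(query),
)
```

The fingerprint is deterministic, so the same query always maps to the same hash, letting operators spot a recurring problematic input without ever seeing its content.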
Summary by CodeRabbit
New Features

Bug Fixes

Documentation

User Interface