
Feat/fix unanswered queries #19

Merged
SanghunYun95 merged 48 commits into main from feat/fix-unanswered-queries on Mar 6, 2026

Conversation

@SanghunYun95
Owner

@SanghunYun95 SanghunYun95 commented Mar 6, 2026

Summary by CodeRabbit

  • New features

    • Strengthened defenses against prompt injection
  • Bug fixes

    • Fallback message when the streaming response is empty
    • Improved handling of streaming timeouts and dropped connections
  • Documentation

    • Added the deployed site URL
  • User interface

    • Removed the export button from the chat screen

…ifespan, and move asyncio import to stdlib group
@vercel

vercel bot commented Mar 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| philo-rag | Ready | Preview, Comment | Mar 6, 2026 0:19am |

@coderabbitai

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough


This PR adds the deployed URL to the docs, switches HuggingFace embeddings to the Inference API with a new API token setting, adds timeout, chunk-count, and client-disconnect handling plus a fallback to streaming, hardens the prompt against injection, relocates imports, and updates the related test mock paths.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docs and config<br>`README.md`, `backend/app/core/config.py` | Adds the deployed site URL and a `HUGGINGFACEHUB_API_TOKEN` settings field. |
| Embedding service<br>`backend/app/services/embedding.py`, `backend/requirements.txt` | Switches local HuggingFace embeddings to `HuggingFaceEndpointEmbeddings` (Inference API), uses the API token with warning logging, and adds the `langchain-huggingface` dependency. |
| LLM streaming and prompt<br>`backend/app/services/llm.py` | Adds a security directive (refuse prompt injection) to the RAG prompt; converts async streaming to an `asyncio.wait_for(__anext__(), 30s)` loop for timeout handling and safe termination. |
| Chat route<br>`backend/app/api/routes/chat.py` | Moves imports to function scope, tracks `chunk_count` in the stream with an empty-response fallback, and stops streaming when a client disconnect is detected. |
| App startup/shutdown logging<br>`backend/app/main.py` | Strengthens logging of preload task results and of exception/cancellation paths, and clarifies exception handling in the readiness check. |
| Test updates<br>`backend/tests/e2e/test_chat_endpoint.py`, `backend/tests/integration/test_supabase_match.py` | Moves mock target paths to the service modules (`app.services.embedding.EmbeddingService.agenerate_embedding`, `app.services.llm.get_english_translation`, `app.services.llm.get_response_stream_async`). |
| Frontend UI<br>`frontend/components/chat/ChatMain.tsx` | Removes the export/share buttons from the top control group. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Route as ChatRoute
    participant LLM as LLM Service
    participant Emb as Embedding Service
    participant DB as VectorStore/DB

    Client->>Route: POST /chat (stream)
    Note right of Route: local imports, then execution begins
    Route->>Emb: agenerate_embedding(query) (await)
    Emb-->>Route: embedding
    Route->>DB: similarity search (await)
    DB-->>Route: docs
    Route->>LLM: get_response_stream_async(prompt, docs) (stream)
    alt stream chunks
        LLM-->>Route: chunk(n)
        Route->>Client: SSE/stream chunk(n)
    else zero chunks
        Route->>Client: fallback content message
    end
    Note over Route,Client: streaming stops when a client disconnect is detected

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Feat/fix unanswered queries #18 — same scope of changes (embedding service switch, chat.py import relocation, streaming/prompt changes) with direct code overlap.
  • Feat/book metadata #17 — shares the move of chat.py service imports to function scope and related main.py changes.
  • Feat/book metadata #16 — related to the chat streaming/embedding call changes (timeout, fallback, import relocation).

🐰 The embeddings crossed a long road here,
the stream now flies a timeout flag,
one token stands to guard the docs,
the tests dance down their brand-new paths,
and the deploy URL shines with pride. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title only partially matches the changes: "fix unanswered queries" reflects just one aspect of the actual work (embedding API switch, streaming improvements, new settings, etc.). |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/fix-unanswered-queries

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/app/services/llm.py (1)

126-135: Guarantee async generator cleanup in a finally block.

In the code at lines 126-135, the generator is not explicitly closed when a TimeoutError occurs, which can leak stream resources. Add a finally block so that await generator.aclose() is always called, and use a logger instead of print() to match the project's logging conventions.

♻️ Suggested patch
 async def get_response_stream_async(context: str, query: str, history: str = ""):
     """
     Returns an async stream of strings from the LLM.
     """
     prompt = get_rag_prompt()
     chain = prompt | get_llm() | StrOutputParser()
     generator = chain.astream({"context": context, "chat_history": history, "query": query})
+    try:
-    while True:
+        while True:
-        try:
+            try:
-            chunk = await asyncio.wait_for(generator.__anext__(), timeout=30.0)
+                chunk = await asyncio.wait_for(generator.__anext__(), timeout=30.0)
-            yield chunk
+                yield chunk
-        except StopAsyncIteration:
+            except StopAsyncIteration:
-            break
+                break
-        except asyncio.TimeoutError:
+    except asyncio.TimeoutError:
-            print("LLM stream chunk timed out after 30 seconds.")
+        logger.warning("LLM stream chunk timed out after 30 seconds.")
-            raise
+        raise
+    finally:
+        await generator.aclose()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm.py` around lines 126 - 135, Ensure the async
generator returned by chain.astream is always closed: wrap the loop that awaits
generator.__anext__ via asyncio.wait_for in try/except/finally and in the
finally call await generator.aclose() to guarantee cleanup on TimeoutError or
other exits; also replace the print("LLM stream chunk timed out after 30
seconds.") with the project logger (e.g., logger.error or similar) to follow
logging conventions and include contextual information (context, query, etc.)
when logging the timeout.
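As a standalone illustration of the pattern this comment asks for, here is a minimal, runnable sketch; `slow_chunks` is a stand-in for `chain.astream(...)` and is not part of the project code:

```python
import asyncio

async def slow_chunks():
    # Stand-in for chain.astream(...): yields a few chunks with a small delay.
    for part in ["Hello", ", ", "world"]:
        await asyncio.sleep(0.01)
        yield part

async def stream_with_timeout(timeout: float = 30.0):
    generator = slow_chunks()
    try:
        while True:
            try:
                # Bound the wait for each individual chunk, not the whole stream.
                chunk = await asyncio.wait_for(generator.__anext__(), timeout=timeout)
                yield chunk
            except StopAsyncIteration:
                break
    except asyncio.TimeoutError:
        # The real service would log via the module logger here, then re-raise.
        raise
    finally:
        # Runs on normal exit, timeout, and early consumer exit alike.
        await generator.aclose()

async def collect() -> str:
    return "".join([chunk async for chunk in stream_with_timeout(timeout=1.0)])

if __name__ == "__main__":
    print(asyncio.run(collect()))
```

The finally clause is what the review is after: it releases the underlying stream even when `asyncio.wait_for` raises `TimeoutError` or the consumer abandons the generator early.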
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 130-143: The current loop breaks on client disconnect
(request.is_disconnected()) but still falls through to the post-loop fallback
that yields when chunk_count == 0; change the behavior so a client disconnect
short-circuits the generator: either replace the break with an immediate return
from the generator or set a flag (e.g., disconnected = True) when
request.is_disconnected() is true and check that flag before executing the
chunk_count == 0 fallback yield; update get_response_stream_async handling
accordingly so no fallback yield is sent after a disconnect.
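A hedged sketch of the short-circuit behavior described above; `FakeRequest` and the fallback text are illustrative stand-ins, not the project's actual types:

```python
import asyncio

class FakeRequest:
    """Illustrative stand-in for a Starlette-style request with is_disconnected()."""

    def __init__(self, disconnect_after: int):
        self.calls = 0
        self.disconnect_after = disconnect_after

    async def is_disconnected(self) -> bool:
        self.calls += 1
        return self.calls > self.disconnect_after

async def stream_response(request, chunks):
    chunk_count = 0
    disconnected = False
    for chunk in chunks:  # stand-in for the async LLM stream
        if await request.is_disconnected():
            disconnected = True
            break
        chunk_count += 1
        yield chunk
    # The empty-response fallback must not fire after a disconnect.
    if chunk_count == 0 and not disconnected:
        yield "Sorry, no answer could be generated."  # hypothetical fallback text

async def collect(request, chunks):
    return [chunk async for chunk in stream_response(request, chunks)]
```

With an empty stream and a connected client the fallback is yielded; after an immediate disconnect nothing is sent at all, which is the fix the comment requests.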

In `@backend/app/main.py`:
- Around line 52-53: The except block binds an unused exception variable `e`;
change the exception handler from `except Exception as e:` to `except
Exception:` in the shutdown/preload wait code so the unused-variable warning
(F841) is removed while keeping the existing `logger.exception("Exception
occurred while waiting for preload task during shutdown.")` call unchanged
(locate the handler where `logger.exception` is invoked in main.py).

---

Nitpick comments:
In `@backend/app/services/llm.py`:
- Around line 126-135: Ensure the async generator returned by chain.astream is
always closed: wrap the loop that awaits generator.__anext__ via
asyncio.wait_for in try/except/finally and in the finally call await
generator.aclose() to guarantee cleanup on TimeoutError or other exits; also
replace the print("LLM stream chunk timed out after 30 seconds.") with the
project logger (e.g., logger.error or similar) to follow logging conventions and
include contextual information (context, query, etc.) when logging the timeout.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4db08edd-431e-44c0-85b4-35eb91bcb224

📥 Commits

Reviewing files that changed from the base of the PR and between 608565f and 95be5fa.

📒 Files selected for processing (10)
  • README.md
  • backend/app/api/routes/chat.py
  • backend/app/core/config.py
  • backend/app/main.py
  • backend/app/services/embedding.py
  • backend/app/services/llm.py
  • backend/requirements.txt
  • backend/tests/e2e/test_chat_endpoint.py
  • backend/tests/integration/test_supabase_match.py
  • frontend/components/chat/ChatMain.tsx

Comment thread backend/app/api/routes/chat.py
Comment thread backend/app/main.py Outdated

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
backend/app/main.py (1)

9-15: Remove the duplicate asyncio import.

Lines 9 and 15 import the same module twice; keep only one of them.

Suggested patch
 from app.api.routes import chat
 from app.core.rate_limit import limiter
 import asyncio
 from contextlib import asynccontextmanager
 import logging
 
 logger = logging.getLogger(__name__)
-
-import asyncio
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/main.py` around lines 9 - 15, the asyncio module is imported
twice at the top of the file; remove one of the duplicate import statements so
that only a single import asyncio remains. The target is the module name
asyncio in the file's top import block, where import asyncio is currently
declared twice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/app/main.py`:
- Around line 94-96: The readiness_check function currently uses a broad except
Exception as e: which triggers BLE001; to make the intent explicit leave the
broad catch but annotate it with the lint-ignore and rationale by changing the
except to include a trailing comment (e.g., add " # noqa: BLE001 - Service
boundary layer catches all preload failures") next to the except Exception as
e:, keeping the logger.warning("Preload task failed during readiness check: %s",
e) and the JSONResponse({"status": "failed"}, status_code=503) behavior
unchanged so the service boundary continues to convert any preload
initialization error into a 503 while suppressing the BLE001 warning.

In `@backend/app/services/llm.py`:
- Line 137: The warning currently logs the raw user input via
logger.warning(f"... Query: {query}"); change it to avoid storing PII by logging
only non-identifying info (e.g., a short hash or metadata). Replace the f-string
that references query with a redacted identifier: compute a short SHA-256 (or
similar) of query (using hashlib.sha256(query.encode('utf-8')).hexdigest()[:8])
or log query length/placeholder, and include that hash/id in the logger.warning
call instead of the raw query; ensure you add the hashlib import if you choose
hashing and keep references to the same logger.warning and query variable names.

---

Nitpick comments:
In `@backend/app/main.py`:
- Around line 9-15: the asyncio module is imported twice at the top of the
file; remove one of the duplicate import statements so that only a single
import asyncio remains. The target is the module name asyncio in the file's
top import block, where import asyncio is currently declared twice.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 54e9c16e-f904-4359-b54f-fc238f8cc7d8

📥 Commits

Reviewing files that changed from the base of the PR and between 95be5fa and 2c0f465.

📒 Files selected for processing (3)
  • backend/app/api/routes/chat.py
  • backend/app/main.py
  • backend/app/services/llm.py

Comment thread backend/app/main.py
Comment on lines +94 to 96
except Exception as e:
logger.warning("Preload task failed during readiness check: %s", e)
return JSONResponse({"status": "failed"}, status_code=503)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) main.py에서 preload/readiness 예외 처리 위치 확인
rg -n -C3 'def preload_models|def readiness_check|preload_task.result|except Exception as e|except Exception:' backend/app/main.py

# 2) preload 체인 대상 파일 확인
fd 'embedding.py|llm.py' backend/app/services

# 3) 관련 서비스의 명시적 raise/except 타입 확인
rg -n -C2 'raise\s+' backend/app/services/ 2>/dev/null | head -100

Repository: SanghunYun95/philo-rag

Length of output: 2990


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check embedding service - embeddings property
rg -n -A20 '@property' backend/app/services/embedding.py | grep -A20 'embeddings'

# Check get_llm function implementation
rg -n -A15 'def get_llm' backend/app/services/llm.py

# Check imports and other potential exceptions
head -50 backend/app/services/embedding.py
head -50 backend/app/services/llm.py

Repository: SanghunYun95/philo-rag

Length of output: 5774


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there are try-catch blocks or exception handling patterns in the service files
rg -n 'except|raise' backend/app/services/embedding.py backend/app/services/llm.py

# Check the full get_llm implementation to see if other exceptions could occur
tail -20 backend/app/services/llm.py

# Check preload_models implementation again to understand full context
sed -n '19,45p' backend/app/main.py

Repository: SanghunYun95/philo-rag

Length of output: 2005


Make the broad exception handling in readiness_check explicit.

Line 94's except Exception as e: triggers the BLE001 warning. This is an intentional pattern: a readiness-check endpoint at the service boundary catches every preload failure and returns a 503. Since HuggingFace and Google Generative AI library initialization can raise a variety of exceptions beyond RuntimeError/ValueError, rather than narrowing the exception type, make the intent explicit by adding # noqa: BLE001 with a rationale comment:

except Exception as e:  # noqa: BLE001 - Service boundary layer catches all preload failures
🧰 Tools
🪛 Ruff (0.15.2)

[warning] 94-94: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/main.py` around lines 94 - 96, The readiness_check function
currently uses a broad except Exception as e: which triggers BLE001; to make the
intent explicit leave the broad catch but annotate it with the lint-ignore and
rationale by changing the except to include a trailing comment (e.g., add " #
noqa: BLE001 - Service boundary layer catches all preload failures") next to the
except Exception as e:, keeping the logger.warning("Preload task failed during
readiness check: %s", e) and the JSONResponse({"status": "failed"},
status_code=503) behavior unchanged so the service boundary continues to convert
any preload initialization error into a 503 while suppressing the BLE001
warning.
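To make the service-boundary intent concrete, here is a minimal framework-free sketch; returning a `(payload, status)` tuple instead of a `JSONResponse`, and the "loading" branch for a not-yet-finished preload, are assumptions for illustration:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def readiness_check(preload_task: "asyncio.Future"):
    """Convert any preload failure into a 503 at the service boundary."""
    try:
        if not preload_task.done():
            return {"status": "loading"}, 503
        preload_task.result()  # re-raises whatever the preload task raised
    except Exception as e:  # noqa: BLE001 - service boundary catches all preload failures
        logger.warning("Preload task failed during readiness check: %s", e)
        return {"status": "failed"}, 503
    return {"status": "ok"}, 200
```

The broad catch is deliberate: whatever exception type the preload raised, the endpoint's contract is a 503 until preloading has succeeded, and the noqa comment documents that choice for the linter.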

except asyncio.TimeoutError:
import logging
logger = logging.getLogger(__name__)
logger.warning(f"LLM stream chunk timed out after 30 seconds. Query: {query}")

⚠️ Potential issue | 🟠 Major

Do not log the raw user query on timeout.

Line 137 writes the user's input verbatim to the logs, which can persist sensitive data in production logs. Log only non-identifying information instead of the raw query.

🔧 Suggested fix
-        logger.warning(f"LLM stream chunk timed out after 30 seconds. Query: {query}")
+        logger.warning("LLM stream chunk timed out after 30 seconds")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.warning(f"LLM stream chunk timed out after 30 seconds. Query: {query}")
logger.warning("LLM stream chunk timed out after 30 seconds")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app/services/llm.py` at line 137, The warning currently logs the raw
user input via logger.warning(f"... Query: {query}"); change it to avoid storing
PII by logging only non-identifying info (e.g., a short hash or metadata).
Replace the f-string that references query with a redacted identifier: compute a
short SHA-256 (or similar) of query (using
hashlib.sha256(query.encode('utf-8')).hexdigest()[:8]) or log query
length/placeholder, and include that hash/id in the logger.warning call instead
of the raw query; ensure you add the hashlib import if you choose hashing and
keep references to the same logger.warning and query variable names.
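A minimal sketch of the redaction the prompt describes; the function names are illustrative, not taken from the project:

```python
import hashlib
import logging

logger = logging.getLogger(__name__)

def query_fingerprint(query: str) -> str:
    # Short, non-reversible identifier: lets log lines be correlated
    # without persisting the raw (potentially sensitive) query text.
    return hashlib.sha256(query.encode("utf-8")).hexdigest()[:8]

def log_stream_timeout(query: str) -> str:
    fingerprint = query_fingerprint(query)
    logger.warning(
        "LLM stream chunk timed out after 30 seconds. query_sha=%s query_len=%d",
        fingerprint,
        len(query),
    )
    return fingerprint
```

An 8-hex-character prefix of the SHA-256 digest is enough to correlate repeated timeouts for the same query while revealing nothing about its content.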

@SanghunYun95 SanghunYun95 merged commit 21c1bb5 into main Mar 6, 2026
3 checks passed