Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your review settings.
📝 Walkthrough

A LangGraph-based RAG workflow and an evaluation-logging service were added, chat SSE streaming was switched to a LangGraph event stream, and an evaluation dashboard page, admin authentication, related unit tests, and utility scripts were introduced.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Frontend
    participant ChatAPI as Chat API
    participant Graph as LangGraph<br/>Workflow
    participant LLM
    participant Embedding as Embedding<br/>Service
    participant Supabase as Supabase<br/>(Documents)
    participant Eval as Evaluation<br/>Service
    participant EvalDB as Supabase<br/>(eval_logs)

    User->>Frontend: Send query
    Frontend->>ChatAPI: POST /chat (query, history)
    ChatAPI->>Graph: graph.astream_events()

    rect rgba(100, 150, 200, 0.5)
        Note over Graph: Workflow execution
        Graph->>LLM: rewrite_query() → reformulated_query
        Graph->>Embedding: retrieve() → embedding
        Embedding-->>Supabase: match_documents RPC
        Supabase-->>Graph: documents
        Graph->>LLM: grade_documents() → is_relevant
        alt is_relevant == true or retry_count >= 2
            Graph->>LLM: generate() → answer
        else
            Graph->>Graph: rewrite_query() (retry)
        end
    end

    Graph-->>ChatAPI: final_state (answer, documents)
    ChatAPI->>Frontend: SSE stream (metadata + final chunks)

    rect rgba(150, 200, 100, 0.5)
        Note over ChatAPI,EvalDB: Background evaluation logging
        ChatAPI->>Eval: background_tasks.add_task(evaluate_and_log)
        Eval->>LLM: Run RAGAS evaluation
        Eval->>EvalDB: insert eval_logs
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed

❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 16
🧹 Nitpick comments (1)
backend/tests/unit/test_graph.py (1)
38-51: It is safer to pin down the streaming contract as well. The actual runtime path uses `graph.astream_events(...)` in backend/app/api/routes/chat.py. If the test only checks `ainvoke` as it does now, it will keep passing even when the streaming surface breaks.

Example fix:

```diff
 graph = create_graph()
 assert graph is not None
 # Lightweight check that the graph is executable
 assert hasattr(graph, "ainvoke")
+assert hasattr(graph, "astream_events")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/unit/test_graph.py` around lines 38 - 51, Update the unit test to also verify the streaming contract by checking that the Graph returned from create_graph exposes the astream_events surface used at runtime; specifically, after creating graph via create_graph() (and the existing check for graph.ainvoke), add an assertion that graph has attribute "astream_events" and that it is callable (or an async iterable/async function as expected) so the streaming API used in backend/app/api/routes/chat.py is validated at test-time.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@.agent/documents/stories/001.advanced_rag_system.md`:
- Around line 51-54: The TODO list lists a dashboard and /eval-logs endpoint as
future work but those are already implemented in the branch; update the TODOs to
avoid confusion by either removing or marking the dashboard item as completed
and narrowing any remaining tasks, and ensure the documentation references the
existing dashboard component (page.tsx) and the chat API route that serves
/eval-logs (chat.py) so the item accurately reflects outstanding work only.
In `@backend/app/api/routes/chat.py`:
- Around line 80-82: The streaming handler currently breaks when
request.is_disconnected() is true but still proceeds to the background task
block that enqueues evaluation logs, causing partial full_answer to be recorded;
modify the flow in the chat streaming endpoint (the block using
request.is_disconnected(), the background task registration and the eval_logs
enqueue) so that you only register/enqueue evaluation tasks when the response
was completed successfully (e.g., track a completed flag or check that streaming
finished without early disconnect before calling the eval_logs
enqueue/background task code and before using full_answer).
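The completed-flag approach described above can be sketched as a minimal standalone loop; `is_disconnected` and `enqueue_eval` below are stand-ins for `request.is_disconnected()` and the `background_tasks.add_task(evaluate_and_log, ...)` call, not the project's actual API:

```python
import asyncio

# Minimal sketch of the suggested fix: only enqueue evaluation when the
# stream ran to completion, so a partial full_answer is never recorded.
async def stream_answer(chunks, is_disconnected, enqueue_eval):
    full_answer = ""
    completed = True
    for chunk in chunks:
        if await is_disconnected():
            completed = False  # client went away; abandon the stream
            break
        full_answer += chunk
    if completed and full_answer:
        enqueue_eval(full_answer)  # background task only on success
    return completed

async def _demo():
    logged = []

    async def connected():
        return False

    async def dropped():
        return True

    done = await stream_answer(["a", "b"], connected, logged.append)
    aborted = await stream_answer(["a", "b"], dropped, logged.append)
    return logged, done, aborted

logged, done, aborted = asyncio.run(_demo())
```

With a connected client the full answer `"ab"` is enqueued once; with an immediate disconnect nothing reaches the evaluation queue.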
- Around line 154-165: The /eval-logs endpoint is currently publicly accessible;
update the GET handler get_eval_logs in backend/app/api/routes/chat.py to
require authentication by adding a FastAPI dependency such as
Depends(get_current_user) (or a JWT validation dependency) and return 401 on
failure, ensure get_current_user (or equivalent) is implemented in your auth
module and imported; register an authentication middleware or dependency
enforcement in backend/app/main.py so protected routes are validated globally or
via router.include_router with dependencies; and update the frontend dashboard
route/component (the dashboard page) to enforce a route guard and send the auth
token (or session) when calling /eval-logs so only authenticated users can fetch
evaluation logs.
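The core of that dependency boils down to a constant-time key comparison. The sketch below is standalone Python under the assumption that the real code wraps it in a FastAPI dependency that raises `HTTPException(status_code=401)` when it returns False; `verify_admin_key` and the literal key are illustrative names, not the project's actual helpers:

```python
import hmac

ADMIN_SECRET_KEY = "example-secret"  # in the real app this comes from settings

def verify_admin_key(provided):
    """Return True only for a matching admin key.

    hmac.compare_digest avoids timing side channels when comparing
    secrets; a FastAPI dependency would raise 401 when this is False.
    """
    if not provided:
        return False
    return hmac.compare_digest(provided, ADMIN_SECRET_KEY)
```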
In `@backend/app/services/evaluation.py`:
- Around line 32-33: The logger.info call in evaluate_and_log currently writes
the full user query/response payload to application logs; change it to avoid
logging raw content and instead log only summary metadata (e.g., query_id or
eval_id, lengths like character/word/token counts, and any non-sensitive status
flags). Locate the logger.info usages inside the evaluate_and_log function (and
the similar logging calls later in the same function that log original content)
and replace the message to include only identifiers and numeric/boolean metadata
(for example: f"Starting async evaluate_and_log id={eval_id}
query_len={len(query)} tokens={token_count}"), ensuring no plain user input or
generated answer text is emitted.
- Around line 42-56: The heavy synchronous initializations
(get_llm(new_instance=True) and reading embedding_service.embeddings) are
currently executed on the event loop before launching the background thread;
move those calls inside the _run_evaluate function so they run inside the thread
alongside evaluate. Concretely, relocate the get_llm(new_instance=True) and
embeddings = embedding_service.embeddings into the body of _run_evaluate (before
calling evaluate), keep the evaluate(...) call and the metrics list
(Faithfulness(), AnswerRelevancy()) unchanged, and continue to call result =
await asyncio.to_thread(_run_evaluate).
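The point of the relocation, namely that the heavy setup runs inside `asyncio.to_thread` rather than on the event loop, can be demonstrated with a stand-in for the real `get_llm`/embedding initialization:

```python
import asyncio
import threading

def _heavy_init():
    # Stand-in for get_llm(new_instance=True) / embedding loading.
    # Returns True when invoked off the main (event-loop) thread.
    return threading.current_thread() is not threading.main_thread()

async def evaluate_and_log_sketch():
    def _run_evaluate():
        # Blocking setup now happens inside the worker thread, next to
        # the (omitted) ragas evaluate(...) call, not on the event loop.
        return _heavy_init()

    return await asyncio.to_thread(_run_evaluate)

ran_off_event_loop = asyncio.run(evaluate_and_log_sketch())
```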
In `@backend/app/services/graph.py`:
- Around line 68-89: The retrieve function performs a blocking Supabase RPC via
asyncio.to_thread without concurrency control; add a semaphore guard around the
blocking RPC call (acquire before calling get_client().rpc(...) via
asyncio.to_thread and release after) to limit concurrent DB RPCs and avoid
exhausting the thread pool. Either import and use the existing DB RPC semaphore
used elsewhere (if available) or add a module-level asyncio.Semaphore (e.g.,
rpc_semaphore) and wrap the _search/asyncio.to_thread call in an async with
rpc_semaphore: block so embedding_service.agenerate_embedding and the subsequent
get_client().rpc(...) call are protected.
In `@backend/app/services/llm.py`:
- Around line 97-103: The two prompt rules conflict: "핵심 규칙 1" mandates
answering only from [Context], while rule 3 allows giving philosophical advice
even without context; resolve by adding an explicit precedence and clarified
condition — update the rules block so that Rule 1 always takes precedence unless
the input is classified as "philosophical/ethico-social" by a clear heuristic
(e.g., contains keywords like 왜, 의미, 삶, 감정, 사회) in which case Rule 1 is relaxed
and Rule 3 applies; also revise Rule 3 text to state it only applies when the
question meets that "philosophical" classification and to require explicitly
prefacing responses with "데이터베이스에서 관련 구절을 찾지 못했습니다." before providing cautious
advice, ensuring deterministic behavior across the "핵심 규칙" block.
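The keyword heuristic proposed above is easy to express as a tiny classifier. This sketch uses exactly the suggested keywords and only illustrates the precedence rule; it is not the project's actual prompt logic:

```python
# Keywords from the suggested heuristic (why, meaning, life, emotion, society).
PHILOSOPHICAL_KEYWORDS = ("왜", "의미", "삶", "감정", "사회")

def is_philosophical(question: str) -> bool:
    """Rule 1 (answer strictly from [Context]) applies by default;
    only when this heuristic fires is Rule 3's cautious, explicitly
    prefaced advice allowed."""
    return any(keyword in question for keyword in PHILOSOPHICAL_KEYWORDS)
```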
In `@backend/requirements.txt`:
- Around line 16-18: The requirements pin is missing and evaluation.py uses
deprecated ragas imports; update backend/requirements.txt to pin ragas to at
least v0.4.3 (e.g., ragas>=0.4.3) and migrate backend/app/services/evaluation.py
to the collections-based API used in tmp/check_ragas_types.py: replace old
imports like from ragas import evaluate and from ragas.metrics import
Faithfulness with the new collection-style imports (e.g., from
ragas.metrics.collections import faithfulness) and adjust any
evaluate/Faithfulness usage to the corresponding functions or call patterns in
the collections API so the module matches tmp/check_ragas_types.py and avoids
deprecation/failures on v1.0.
In `@backend/tests/unit/test_evaluation.py`:
- Around line 24-49: The test calls evaluate_and_log but still lets runtime
imports initialize heavy LLM/embedding code; extend the patches to also mock
app.services.llm.get_llm and app.services.embedding.embedding_service (or any
symbol used inside evaluate_and_log that triggers model/embedding
initialization) so the function uses lightweight MagicMocks instead of real
clients; keep the existing mocks for evaluate and get_client and set
return_values on the new mocks to simple stubs so the test remains a pure unit
test without environment-dependent initialization.
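The widened-patching pattern can be shown standalone by registering a throwaway module in `sys.modules` to stand in for `app.services.llm`; the module and symbol names here are placeholders:

```python
import sys
import types
from unittest.mock import MagicMock, patch

# Throwaway module standing in for app.services.llm.
fake_llm_module = types.ModuleType("fake_llm_module")
fake_llm_module.get_llm = lambda: "real-llm"  # would trigger heavy init
sys.modules["fake_llm_module"] = fake_llm_module

# Inside the with-block the heavy initializer is replaced by a MagicMock,
# so code under test never builds a real client.
with patch("fake_llm_module.get_llm", return_value=MagicMock(name="llm")):
    llm = sys.modules["fake_llm_module"].get_llm()

patched_during = not isinstance(llm, str)                  # got the mock
restored_after = fake_llm_module.get_llm() == "real-llm"   # patch undone on exit
```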
In `@frontend/app/dashboard/page.tsx`:
- Around line 39-42: Page correction after filtering is missing, so when a
filter shrinks the results while the user is on a later page, currentLogs
looks empty; use totalPages computed from filteredLogs and itemsPerPage to
either reset currentPage to 1 when the filter changes (or whenever
filteredLogs changes) or clamp it with setCurrentPage(prev => Math.min(prev,
totalPages)); the relevant identifiers are filter/setFilter,
currentPage/setCurrentPage, filteredLogs, currentLogs, and itemsPerPage, and
the same logic belongs in the filter change handler and in the
render/pagination code that uses filteredLogs (the areas around lines 63-72
and 119-120).
- Around line 187-225: The row toggle is only onClick on <tr> so keyboard users
can't open it; move the interactive toggle into a focusable button inside the
row (e.g., render a <button> inside a td that spans the clickable area and uses
onClick={() => setExpandedId(expandedId === log.id ? null : log.id)}), add
aria-expanded={expandedId === log.id} and aria-controls={`details-${log.id}`} on
that button, give the expanded panel the matching id (`details-${log.id}`), and
keep the ChevronUp/ChevronDown inside that button so both mouse and keyboard can
toggle the expansion; ensure the button has accessible styling (full width,
text-left) and that any existing onClick on <tr> is removed.
In `@frontend/components/sidebar/Sidebar.tsx`:
- Around line 82-89: The Link for the dashboard currently strips the focus
outline (class focus-visible:outline-none) leaving keyboard users without a
visible focus; update the Link's focus styles (on the Link element rendered in
Sidebar component — symbol: Link with className containing
"focus-visible:outline-none") to replace that class with an accessible visible
focus ring such as focus-visible:ring-2 and a matching color/offset (e.g.
focus-visible:ring-[`#d9b74a`]/70 and focus-visible:ring-offset-2) so the element
keeps a clear visual focus indicator while preserving existing hover/active
styles.
In `@README.md`:
- Around line 3-8: Open README.md, remove the leftover Git merge conflict
markers (<<<<<<<, =======, >>>>>>>) and choose the correct site URL line to keep
(either "https://philo-rag.vercel.app/" or "https://philo-rag.web.app/"),
removing the other conflicting line so the README shows a single, clean URL and
consistent formatting; ensure there are no remaining conflict markers anywhere
in the file.
- Around line 4-5: The new frontend domain listed in the README
(https://philo-rag.vercel.app) is missing from the backend's CORS allowlist,
so browser API calls will be blocked; find the CORSMiddleware configuration
(the allow_origins parameter) in backend/app/main.py and add the new domain
to the current list, plus any needed sub-path/dashboard origins (e.g.,
https://philo-rag.vercel.app, https://philo-rag.vercel.app/dashboard, or a
wildcard), or drive the list from an environment variable/setting so it is
included dynamically.
In `@tmp/check_db.py`:
- Around line 16-18: The current check uses db.table("eval_logs").select("*")
and prints response.data which may expose sensitive fields; change the query to
only request safe columns (e.g., select specific column names instead of "*") or
use an aggregate/count (e.g., select("id").limit(1) or a count) and log only a
success message and minimal metadata (presence/row count or selected safe column
values) instead of printing response.data.
- Around line 1-8: Don't rely on os.getcwd() and default load_dotenv(); compute
paths relative to this script file instead and pass explicit paths: derive the
repo root from __file__ (use Path(__file__).resolve().parent / parent as
appropriate) to build the backend path for sys.path.append and to construct the
.env file path to pass into load_dotenv(dotenv_path=...), ensuring imports and
env loading work regardless of the current working directory; update references
in this file where os.getcwd(), sys.path.append(...), and load_dotenv() are used
(e.g., the sys.path.append call and the load_dotenv invocation).
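A sketch of that `__file__`-relative setup; the `tmp/` location and repo layout are assumptions based on the file paths in this review, and the `load_dotenv` call is shown commented out because python-dotenv is a third-party dependency:

```python
import sys
from pathlib import Path

# Resolve everything from this script's own location so imports and env
# loading work no matter which directory the script is launched from.
SCRIPT_DIR = Path(__file__).resolve().parent      # e.g. <repo>/tmp
REPO_ROOT = SCRIPT_DIR.parent                     # assumed repo root
BACKEND_DIR = REPO_ROOT / "backend"
ENV_FILE = BACKEND_DIR / ".env"

sys.path.append(str(BACKEND_DIR))
# load_dotenv(dotenv_path=ENV_FILE)  # pass the explicit path, not the CWD default
```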
---
Nitpick comments:
In `@backend/tests/unit/test_graph.py`:
- Around line 38-51: Update the unit test to also verify the streaming contract
by checking that the Graph returned from create_graph exposes the astream_events
surface used at runtime; specifically, after creating graph via create_graph()
(and the existing check for graph.ainvoke), add an assertion that graph has
attribute "astream_events" and that it is callable (or an async iterable/async
function as expected) so the streaming API used in
backend/app/api/routes/chat.py is validated at test-time.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 17c1f444-140c-454e-9ffa-0a5a2057e9c3
📒 Files selected for processing (15)
- .agent/documents/stories/001.advanced_rag_system.md
- README.md
- backend/app/api/routes/chat.py
- backend/app/main.py
- backend/app/services/evaluation.py
- backend/app/services/graph.py
- backend/app/services/llm.py
- backend/requirements.txt
- backend/tests/unit/test_evaluation.py
- backend/tests/unit/test_graph.py
- frontend/app/dashboard/page.tsx
- frontend/components/sidebar/Sidebar.tsx
- frontend/next.config.ts
- tmp/check_db.py
- tmp/check_ragas_types.py
```
<<<<<<< HEAD
**실제 배포된 사이트 URL:** https://philo-rag.vercel.app/
**평가 대시보드 URL:** https://philo-rag.vercel.app/dashboard (준비 중)
=======
**사이트 URL:** [https://philo-rag.web.app/](https://philo-rag.web.app/)
>>>>>>> 7c7f48ee77ac78913010df69b111f3eb2448d384
```
Remove the merge-conflict markers from the README.

The `<<<<<<<`, `=======`, `>>>>>>>` markers are still sitting at the top, so the document renders broken right away. Merging in this state would also leave the deployment URL guidance confusing.
Suggested fix:

```diff
-<<<<<<< HEAD
 **실제 배포된 사이트 URL:** https://philo-rag.vercel.app/
 **평가 대시보드 URL:** https://philo-rag.vercel.app/dashboard (준비 중)
-=======
-**사이트 URL:** [https://philo-rag.web.app/](https://philo-rag.web.app/)
->>>>>>> 7c7f48ee77ac78913010df69b111f3eb2448d384
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
**실제 배포된 사이트 URL:** https://philo-rag.vercel.app/
**평가 대시보드 URL:** https://philo-rag.vercel.app/dashboard (준비 중)
```
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)
[warning] 3-3: Heading style
Expected: atx; Actual: setext
(MD003, heading-style)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` around lines 3 - 8, Open README.md, remove the leftover Git merge
conflict markers (<<<<<<<, =======, >>>>>>>) and choose the correct site URL
line to keep (either "https://philo-rag.vercel.app/" or
"https://philo-rag.web.app/"), removing the other conflicting line so the README
shows a single, clean URL and consistent formatting; ensure there are no
remaining conflict markers anywhere in the file.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 163-169: The except block in the eval_logs query swallows DB
errors by returning an empty list; instead surface failures as a 5xx so callers
can distinguish "no logs" from "backend error". Change the except in the block
that calls get_client().table("eval_logs")...execute() to either re-raise the
caught exception or raise a FastAPI HTTPException(status_code=500,
detail="Failed to fetch evaluation logs") (import HTTPException from fastapi),
and keep logger.exception("Failed to fetch evaluation logs") for logging; do not
return [] on exception.
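The handler change can be sketched without FastAPI installed by using a stand-in `HTTPException`; in the real route the class comes from `fastapi` and `fetch` is the Supabase query:

```python
import logging

logger = logging.getLogger("chat")

class HTTPException(Exception):
    # Stand-in for fastapi.HTTPException so this sketch runs standalone.
    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail

def get_eval_logs(fetch):
    try:
        return fetch()
    except Exception:
        # Keep the server-side log, but surface a 5xx instead of
        # masking the failure as an empty list.
        logger.exception("Failed to fetch evaluation logs")
        raise HTTPException(status_code=500, detail="Failed to fetch evaluation logs")
```

Callers can now tell "no logs" (an empty list with status 200) apart from "backend error" (status 500).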
In `@backend/app/core/config.py`:
- Around line 18-19: The ADMIN_SECRET_KEY currently has a hardcoded default
("dev-secret-key") that bypasses auth; change ADMIN_SECRET_KEY in config to have
no insecure default (e.g., Optional[str] = None or empty string) and update
validate_required_settings() to treat ADMIN_SECRET_KEY as required in
non-development runs (raise on missing/empty) while allowing a clear
development-only branch if needed (e.g., only skip validation when settings.ENV
== "development"). Ensure references to ADMIN_SECRET_KEY and
validate_required_settings() are updated accordingly so deployments fail fast
when the secret is not provided.
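A minimal fail-fast check along these lines (the `env` values and function shape are assumptions; the real project may structure its settings object differently):

```python
def validate_required_settings(env, admin_secret_key):
    """Raise at startup when the admin secret is missing outside development.

    Development keeps a clearly marked escape hatch; every other
    environment fails fast instead of silently accepting a default key.
    """
    if env == "development":
        return
    if not admin_secret_key:
        raise RuntimeError(
            "ADMIN_SECRET_KEY must be set in non-development environments"
        )
```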
In `@backend/app/main.py`:
- Around line 21-30: The synchronous model initialization in the async lifespan
is blocking the event loop; wrap the blocking calls to
embedding_service.embeddings and get_llm() with asyncio.to_thread and await them
so they run on a worker thread while the lifespan remains async; update the
startup block to import asyncio if needed, call await asyncio.to_thread(lambda:
embedding_service.embeddings) and await asyncio.to_thread(get_llm) (or
equivalent callables) to offload HuggingFace downloads and ChatOpenAI init to a
thread and still wait for completion.
In `@backend/app/services/graph.py`:
- Around line 122-124: The call get_llm().with_structured_output(GradeDocuments)
fails when get_llm() returns a RunnableWithFallbacks (from with_fallbacks());
fix by ensuring structured output is applied on the underlying BaseChatModel
instances before creating fallbacks—update _create_llm_instance() to call
with_structured_output(GradeDocuments) on each BaseChatModel (or create a new
initializer used by graph.py) and only then compose them via with_fallbacks(),
so that get_llm() returns a RunnableWithFallbacks wrapping models that already
implement structured output; adjust graph.py to use the updated get_llm() (or
the new initializer) when building chain.
In `@backend/app/services/llm.py`:
- Line 50: Remove the unnecessary f-string prefix on the print call in llm.py by
replacing print(f"Initializing OpenAI instance...") with either
print("Initializing OpenAI instance...") or, to match the rest of the file,
replace it with a logging call (e.g., logger.info("Initializing OpenAI
instance...")) using the module's existing logger variable so the message is not
an f-string and Ruff F541 is avoided.
In `@frontend/app/dashboard/page.tsx`:
- Line 1: The page is currently a client component ("use client") which exposes
NEXT_PUBLIC_ADMIN_SECRET_KEY to the browser; remove the admin-key usage from the
client by converting this component back to a server component (remove "use
client") or by creating a Next.js API route/server action that performs the
backend call to /eval-logs using a non-public environment variable (e.g.,
process.env.ADMIN_SECRET_KEY, not NEXT_PUBLIC_*). Implement an API handler
(e.g., pages/api/eval-logs or app/api/eval-logs/route) that reads the private
env var, performs the fetch to the backend, enforces session-based access
control, and have the client (or a minimal client component) call that API route
instead of referencing NEXT_PUBLIC_ADMIN_SECRET_KEY directly in frontend code
such as dashboard/page.tsx.
In `@tmp/check_db.py`:
- Around line 31-32: The except block that catches exceptions (the "except
Exception as e:" handler that currently prints "Error connecting to Supabase or
reading from eval_logs: {e}") should terminate the process with a non‑zero exit
code so CI/health checks detect failure; update that handler to log or print the
error and then call sys.exit(1) (or raise SystemExit(1)) after importing sys,
ensuring the script returns a failing status on DB connection/read errors.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4a46dbad-bd98-41ea-9587-2516aaace0a9
📒 Files selected for processing (15)
- .agent/documents/stories/001.advanced_rag_system.md
- README.md
- backend/app/api/routes/chat.py
- backend/app/core/auth.py
- backend/app/core/config.py
- backend/app/main.py
- backend/app/services/evaluation.py
- backend/app/services/graph.py
- backend/app/services/llm.py
- backend/requirements.txt
- backend/tests/unit/test_evaluation.py
- backend/tests/unit/test_graph.py
- frontend/app/dashboard/page.tsx
- frontend/components/sidebar/Sidebar.tsx
- tmp/check_db.py
✅ Files skipped from review due to trivial changes (3)
- .agent/documents/stories/001.advanced_rag_system.md
- README.md
- backend/requirements.txt
🚧 Files skipped from review as they are similar to previous changes (4)
- frontend/components/sidebar/Sidebar.tsx
- backend/tests/unit/test_graph.py
- backend/tests/unit/test_evaluation.py
- backend/app/services/evaluation.py
Actionable comments posted: 2
♻️ Duplicate comments (2)
frontend/app/dashboard/page.tsx (1)
50-71: ⚠️ Potential issue | 🟠 Major: Don't render fetch failures as an empty state.

Abnormal responses and exceptions only hit `console.error` and clear `loading`, so a 401/500 also ends up shown as "데이터가 없습니다." A production screen needs to distinguish "no logs" from "auth failure / backend outage", so it is safer to keep a dedicated `error` state and render an error-specific message or redirect.

Also applies to: 177-185
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/app/dashboard/page.tsx` around lines 50 - 71, The fetchLogs inside useEffect currently swallows non-OK responses and exceptions into console.error and only toggles setLoading, causing the UI to show "no data" for errors; add an error state (e.g., error and setError) and update it from fetchLogs on non-OK responses, non-array payloads, and caught exceptions (use the same places where you call console.error), then render or redirect from the component based on error instead of treating all failures as empty logs; update references to setLoading to still occur in finally but ensure error is set before setLoading(false) so the UI can show an error-specific message.

frontend/app/api/eval-logs/route.ts (1)
3-28: ⚠️ Potential issue | 🔴 Critical: This proxy has no caller authentication, so it exposes the evaluation logs outright.

Hiding the server secret key from the browser is not protection by itself. Right now anyone who calls `/api/eval-logs` gets this handler to attach `x-admin-key` and reach the backend on their behalf, so combined with the client fetch in frontend/app/dashboard/page.tsx the evaluation logs are effectively public. This route must first validate the user session/cookie/JWT, return 401 on failure, and only then proxy to the backend.
Verify each finding against the current code and only fix it if needed. In `@frontend/app/api/eval-logs/route.ts` around lines 3 - 28, The GET handler currently proxies backend eval logs without authenticating the caller; update the GET function to first validate the incoming user's session/auth token (e.g., verify session via getServerSession, inspect/verify a JWT from cookies, or validate your app's auth cookie) and if validation fails return NextResponse.json({ error: "Unauthorized" }, { status: 401 }); only after successful auth proceed to use adminKey and fetch(`${baseUrl}/api/v1/chat/eval-logs`); ensure the auth check happens before any use of adminKey or proxying so the admin secret is never exposed to unauthenticated requests.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 163-173: The current exception handler returns the DB error string
to clients which leaks internals; keep the server-side logger.exception(...)
as-is but change the raised HTTPException in the except block (where
get_client(), table("eval_logs") and logger.exception are used) to return a
generic fixed error message (e.g. "Failed to fetch evaluation logs") without
including str(e) or other internal details, preserving the 500 status_code;
ensure no DB exception content is interpolated into HTTPException.detail.
In `@backend/app/core/config.py`:
- Around line 32-34: The single-line if statements using settings.SUPABASE_URL
and settings.SUPABASE_SERVICE_KEY trigger Ruff E701; replace each one-liner with
a full block form (e.g., if not settings.SUPABASE_URL: followed by a new
indented line that appends "SUPABASE_URL" to the missing list) for both checks
referencing settings.SUPABASE_URL and settings.SUPABASE_SERVICE_KEY so the
`missing` list is updated the same way but without single-line `if` statements.
---
Duplicate comments:
In `@frontend/app/api/eval-logs/route.ts`:
- Around line 3-28: The GET handler currently proxies backend eval logs without
authenticating the caller; update the GET function to first validate the
incoming user's session/auth token (e.g., verify session via getServerSession,
inspect/verify a JWT from cookies, or validate your app's auth cookie) and if
validation fails return NextResponse.json({ error: "Unauthorized" }, { status:
401 }); only after successful auth proceed to use adminKey and
fetch(`${baseUrl}/api/v1/chat/eval-logs`); ensure the auth check happens before
any use of adminKey or proxying so the admin secret is never exposed to
unauthenticated requests.
In `@frontend/app/dashboard/page.tsx`:
- Around line 50-71: The fetchLogs inside useEffect currently swallows non-OK
responses and exceptions into console.error and only toggles setLoading, causing
the UI to show "no data" for errors; add an error state (e.g., error and
setError) and update it from fetchLogs on non-OK responses, non-array payloads,
and caught exceptions (use the same places where you call console.error), then
render or redirect from the component based on error instead of treating all
failures as empty logs; update references to setLoading to still occur in
finally but ensure error is set before setLoading(false) so the UI can show an
error-specific message.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2f698d08-b601-4836-a6f0-0c351d2cabc4
📒 Files selected for processing (8)
- backend/app/api/routes/chat.py
- backend/app/core/config.py
- backend/app/main.py
- backend/app/services/graph.py
- backend/app/services/llm.py
- frontend/app/api/eval-logs/route.ts
- frontend/app/dashboard/page.tsx
- tmp/check_db.py
🚧 Files skipped from review as they are similar to previous changes (3)
- tmp/check_db.py
- backend/app/main.py
- backend/app/services/graph.py
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 120-147: The stream handler on_chat_model_stream currently yields
a fallback when chunk_count == 0 and passes empty full_answer to evaluation,
potentially discarding the AgentState.answer; fix by checking
AgentState/final_state for a stored answer (e.g., final_state.get("answer"))
before scheduling evaluate_and_log and use that as the answer if full_answer is
empty, while still yielding the user-facing fallback text for the stream; ensure
evaluate_and_log receives the true answer, reformulated_query, documents, and
is_relevant from final_state rather than an empty string so eval logs match the
user-visible response.
In `@frontend/app/dashboard/page.tsx`:
- Around line 160-165: The badge always shows "Healthy" regardless of
stat.score; update the conditional rendering inside the block that checks
stat.score !== undefined so the displayed text matches the score band (e.g., use
stat.score >= 0.7 -> "Healthy", stat.score >= 0.4 -> "At risk" or "Warning",
else -> "Unhealthy"/"Critical"), keep the existing className color logic but
ensure the label text is chosen from the same thresholds (stat.score) so color
and text stay in sync; implement the text selection inline or via a small helper
function (e.g., getHealthLabel(score)) referenced where stat.score is used.
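The actual component is TypeScript; this Python sketch just pins down the score bands so label and color stay in sync, with the 0.7/0.4 cutoffs taken from the comment above (the helper name mirrors the suggested getHealthLabel):

```python
def get_health_label(score):
    # Same thresholds the badge color logic uses, so text and color agree.
    if score >= 0.7:
        return "Healthy"
    if score >= 0.4:
        return "At risk"
    return "Unhealthy"
```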
- Around line 95-112: The dashboard shows `logs.length` and the averages
(computed in `averages`) as if they represent the full dataset even though
`/api/eval-logs` only returns the most recent 50; either update the UI to
indicate these are recent-sample metrics (e.g., change the stat label from "전체
평가 수" to "최근 평가 수 (샘플)" and append a note to the average labels) or modify the
API call to return a true total (e.g., add `totalCount` in the response) and
then use that `totalCount` for the "전체 평가 수" stat while keeping `logs.length`
for the sample-based averages; locate and update the `logs`, `averages`, and
`stats` usage in this file and adjust the fetch logic for `/api/eval-logs` and
the displayed labels accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 44f19adf-72b3-4069-ae8d-473252b10662
📒 Files selected for processing (4)
- backend/app/api/routes/chat.py
- backend/app/core/config.py
- frontend/app/api/eval-logs/route.ts
- frontend/app/dashboard/page.tsx
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/app/core/config.py
- frontend/app/api/eval-logs/route.ts
Actionable comments posted: 2
♻️ Duplicate comments (1)
backend/app/api/routes/chat.py (1)
129-154: ⚠️ Potential issue | 🟠 Major: Don't enqueue an evaluation with an empty `answer` after sending a fallback response.

Even when the fallback text is sent to the user at lines 137-138, `full_answer` stays empty, and `final_state` is truthy as long as it has documents, so it passes the condition at line 143. As a result `evaluate_and_log(...)` receives `answer=""`, `eval_logs.answer` diverges from the response actually shown, and the metrics get polluted. Either enqueue only when `full_answer` is non-empty, or explicitly mark fallbacks as a logged case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/api/routes/chat.py` around lines 129 - 154, The background evaluation is being scheduled even when a fallback message was streamed and full_answer is empty, causing evaluate_and_log to record answer=""; update the enqueue condition in the streaming completion block (the logic around chunk_count, full_answer, final_state, background_tasks) so that background_tasks.add_task(evaluate_and_log, ...) is only called when full_answer is non-empty (or explicitly pass a flag/label indicating this is a fallback) — i.e., add full_answer truthiness to the if-check before scheduling evaluate_and_log (or add a parameter like is_fallback and handle that inside evaluate_and_log) to ensure eval_logs.answer matches the actually shown response.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/routes/chat.py`:
- Around line 170-173: The async route is performing a synchronous Supabase call
directly (get_client().table(...).select(...).order(...).limit(...).execute()),
which blocks the event loop; change it to run that DB call inside
asyncio.to_thread and await it (e.g., await asyncio.to_thread(lambda:
get_client().table(...).select(...).order(...).limit(...).execute())) and then
return res.data; also ensure asyncio is imported in this module so the await
works.
In `@frontend/app/dashboard/page.tsx`:
- Around line 95-112: The cards show 0% during initial load or fetch failure
because averages are computed from logs=[]; change logic so averages are only
computed when logs.length > 0 AND not in loading/error state (use the existing
loading/error flags), otherwise set averages to null/undefined; update the stats
array (used by stats, stats[i].value and stats[i].score) to show a
placeholder/skeleton when score/value is null and avoid formatting (e.g., don't
call toFixed on null); also make getScoreColor(score: number | null | undefined)
tolerate null/undefined by returning a neutral/placeholder class for non-numeric
scores; reference variables/functions: averages, logs, getScoreColor, stats.
---
Duplicate comments:
In `@backend/app/api/routes/chat.py`:
- Around line 129-154: The background evaluation is being scheduled even when a
fallback message was streamed and full_answer is empty, causing evaluate_and_log
to record answer=""; update the enqueue condition in the streaming completion
block (the logic around chunk_count, full_answer, final_state, background_tasks)
so that background_tasks.add_task(evaluate_and_log, ...) is only called when
full_answer is non-empty (or explicitly pass a flag/label indicating this is a
fallback) — i.e., add full_answer truthiness to the if-check before scheduling
evaluate_and_log (or add a parameter like is_fallback and handle that inside
evaluate_and_log) to ensure eval_logs.answer matches the actually shown
response.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4cd0be51-e542-4807-9db4-5c4e48878c0a
📒 Files selected for processing (2)
- backend/app/api/routes/chat.py
- frontend/app/dashboard/page.tsx
```python
try:
    from app.services.database import get_client
    res = get_client().table("eval_logs").select("*").order("created_at", desc=True).limit(50).execute()
    return res.data
```
🧩 Analysis chain
🏁 Script executed:
# First, let's check if the file exists and see the context around lines 170-173
fd "chat.py" --path "*/api/routes/*"

Repository: SanghunYun95/philo-rag
Length of output: 281
🏁 Script executed:
# Also look for graph.py
fd "graph.py" --path "*/services/*"

Repository: SanghunYun95/philo-rag
Length of output: 281
🏁 Script executed:
# Search for chat.py file
fd "chat\.py"

Repository: SanghunYun95/philo-rag
Length of output: 135
🏁 Script executed:
# Search for graph.py file
fd "graph\.py"

Repository: SanghunYun95/philo-rag
Length of output: 128
🏁 Script executed:
# Read chat.py with context around lines 170-173
wc -l backend/app/api/routes/chat.py

Repository: SanghunYun95/philo-rag
Length of output: 100
🏁 Script executed:
# Read chat.py lines 160-185 for full context
cat -n backend/app/api/routes/chat.py | sed -n '160,185p'

Repository: SanghunYun95/philo-rag
Length of output: 1179
🏁 Script executed:
# Read graph.py lines 65-95 for the comparison pattern
cat -n backend/app/services/graph.py | sed -n '65,95p'

Repository: SanghunYun95/philo-rag
Length of output: 1264
Offload the Supabase query in the async route with asyncio.to_thread().
get_eval_logs is an async route, yet at line 172 it executes the synchronous DB call (.execute()) directly. The same Supabase client call is already offloaded via asyncio.to_thread() in backend/app/services/graph.py:83-90, so apply the same pattern here for consistency and to avoid blocking the event loop.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/api/routes/chat.py` around lines 170 - 173, The async route is
performing a synchronous Supabase call directly
(get_client().table(...).select(...).order(...).limit(...).execute()), which
blocks the event loop; change it to run that DB call inside asyncio.to_thread
and await it (e.g., await asyncio.to_thread(lambda:
get_client().table(...).select(...).order(...).limit(...).execute())) and then
return res.data; also ensure asyncio is imported in this module so the await
works.
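The offloading pattern the comment recommends can be sketched as follows. This is a minimal sketch under one stated assumption: fetch_eval_logs_sync is a hypothetical stand-in for the blocking Supabase client chain, since only the to_thread pattern matters here, not the actual database call.

```python
import asyncio


def fetch_eval_logs_sync(limit: int = 50) -> list[dict]:
    # Stand-in for the blocking Supabase chain from the route:
    # get_client().table("eval_logs").select("*") \
    #     .order("created_at", desc=True).limit(limit).execute()
    return [{"id": i} for i in range(limit)]


async def get_eval_logs(limit: int = 50) -> list[dict]:
    # asyncio.to_thread runs the synchronous call in a worker thread,
    # so the event loop keeps serving other requests in the meantime.
    return await asyncio.to_thread(fetch_eval_logs_sync, limit)


print(len(asyncio.run(get_eval_logs(5))))  # 5
```

The same wrapper applies unchanged to the real client call: await asyncio.to_thread(lambda: get_client().table("eval_logs")...execute()) and return res.data, as the agent prompt above describes.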
Summary by CodeRabbit
New Features
Improvements
Documentation