SanghunYun95 · Mar 28, 2026 · Mar 2, 2026
diff --git a/.agent/documents/bmad.md b/.agent/documents/bmad.md
@@ -6,7 +6,7 @@
 
 ## 1. Core Philosophy
 
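-- **Docs-as-Code:** Every feature starts from a story file in `documents/stories/`.
+- **Docs-as-Code:** Every feature starts from a story file in `.agent/documents/stories/`.
 - **Behavior-driven:** Features are defined in terms of user behavior and expected outcomes (Acceptance Criteria).
 - **Model-based:** Complex logic is expressed as structured models (Mermaid diagrams, JSON schemas, etc.) rather than prose.
 - **Context Integrity:** Documents are split into story-sized units so the AI can focus only on the information it needs.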
@@ -18,7 +18,7 @@
 ### 📋 [Analysis Phase] - Business Analyst (Analyst)
 
 - **Goal:** Turn ambiguous requirements into clear stories.
-- **Deliverable:** `documents/stories/ID.story_name.md` (includes a Gherkin-style Behavior definition)
+- **Deliverable:** `.agent/documents/stories/ID.story_name.md` (includes a Gherkin-style Behavior definition)
 - **Guideline:** Focus on business logic of the form "when the user does X, the result should be Y."
 
 ### 📐 [Architecture Phase] - System Architect (Architect)
@@ -46,7 +46,7 @@
 ### ① Planning and Design Scenario
 
 **Instruction:** "Use the BMAD skill to create a story file for the 'shared system backend core module for an AI-based contract lifecycle management (CLM) platform'."
-**AI action:** Creates `documents/stories/001.clm-shared-system-core-module.md`, then requests approval.
+**AI action:** Creates `.agent/documents/stories/001.clm-shared-system-core-module.md`, then requests approval.
 
 ### ② Prompt Example
 

diff --git a/.agent/documents/improvement_plan.md b/.agent/documents/improvement_plan.md
@@ -42,4 +42,4 @@ LLM 기반 서비스의 보안을 위해 다음과 같은 전략을 수립합니
 ---
 
 > [!TIP]
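-> We recommend maintaining the **security Markdown guidelines** as a separate document and keeping them in sync with the validation logic in `back-end`.
+> We recommend maintaining the **security Markdown guidelines** as a separate document and keeping them in sync with the validation logic in `backend`.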
diff --git a/.agent/documents/stories/001.advanced_rag_system.md b/.agent/documents/stories/001.advanced_rag_system.md
@@ -18,7 +18,7 @@
 - **I want to** see numeric scores for how accurate (Faithfulness) and how relevant (Relevance) the AI's answers are.
 - **Acceptance Criteria:**
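   - After an answer is generated, compute Faithfulness and Answer Relevance scores with RAGAS.
-  - If an answer scores below a certain threshold, log it and run an improvement process.
+  - If an answer has **Faithfulness below 0.7 or Answer Relevance below 0.7**, log it and run an improvement process that retries retrieval and generation up to 2 times.
   - Evaluation results can be viewed in a dashboard or a log file.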
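
The threshold-and-retry criterion above can be sketched as a small quality gate. This is a minimal sketch under stated assumptions, not the project's implementation: `generate` and `evaluate` are hypothetical callables standing in for the retrieval/generation step and the RAGAS scoring step, and the `Scores` type is invented for illustration.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Tuple

logger = logging.getLogger("rag.quality")

FAITHFULNESS_MIN = 0.7      # threshold taken from the acceptance criteria
ANSWER_RELEVANCE_MIN = 0.7  # threshold taken from the acceptance criteria
MAX_RETRIES = 2             # at most two re-retrieval/regeneration attempts


@dataclass
class Scores:
    """Hypothetical container for the two evaluation scores."""
    faithfulness: float
    answer_relevance: float

    def passes(self) -> bool:
        return (self.faithfulness >= FAITHFULNESS_MIN
                and self.answer_relevance >= ANSWER_RELEVANCE_MIN)


def answer_with_quality_gate(
    question: str,
    generate: Callable[[str], str],          # assumed: retrieves context and generates an answer
    evaluate: Callable[[str, str], Scores],  # assumed: wraps the scoring step
) -> Tuple[str, Scores, int]:
    """Return (answer, scores, retries_used); retry while scores are below threshold."""
    for attempt in range(MAX_RETRIES + 1):
        answer = generate(question)
        scores = evaluate(question, answer)
        if scores.passes():
            break
        # Log every low-scoring answer, including the last one.
        logger.warning(
            "low-quality answer (attempt %d): faithfulness=%.2f relevance=%.2f",
            attempt + 1, scores.faithfulness, scores.answer_relevance,
        )
    return answer, scores, attempt
```

If every attempt stays below threshold, the last answer is returned together with its scores so the caller can decide how to surface it.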
 
 ## 3. Architecture Notes
@@ -44,9 +44,8 @@ graph TD
 ## 4. Security Considerations
 
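 ### Prompt Injection Prevention (Anti-Injection)
-- Add a `Strict Instruction` to the system prompt (already implemented: `llm.py: get_rag_prompt`).
-- Add input validation (sanitization) logic.
-- Use the `Post-Prompting` technique to restate the core instructions after the user input.
+- Apply a `Strict Instruction` in the system prompt (basic version implemented: `llm.py: get_rag_prompt`).
+- **Not yet implemented / planned:** input validation (sanitization) and logic that restates the core instructions using the `Post-Prompting` technique.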
 
 ---
 > [!NOTE]

diff --git a/.agent/rules/security_guideline.md b/.agent/rules/security_guideline.md
@@ -19,19 +19,27 @@
 
 ## 2. Input Sanitization
 
-- **Markdown/Script Injection:** Remove dangerous HTML tags such as `<script>` and `<iframe>` from user input.
-- **Length Limiting:** Limit input length to prevent denial-of-service (DoS) attacks via excessively long input.
+- **Markdown/Script Injection:** Remove dangerous HTML tags such as `<script>` and `<iframe>` from user input using regular expressions.
+- **Length Limiting:**
+  - **Maximum input length:** 2,000 characters or 2,048 tokens (the stricter of the two applies).
+  - **When the limit is exceeded:** Immediately return an error message to the user (HTTP 400 - Bad Request).
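
The two rules above can be sketched with the standard library alone. This is an illustration under assumptions: `sanitize_user_input` and `InputTooLongError` are hypothetical names, only the character limit is checked (the token limit would need the model's tokenizer), and regex-based tag removal is a best-effort filter rather than a full HTML sanitizer.

```python
import re

MAX_INPUT_CHARS = 2000  # character limit from the guideline; the token limit is not checked here

# Paired dangerous elements (tag plus content), then any stray opening or closing tag.
_PAIRED = re.compile(r"<(script|iframe)\b[^>]*>.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
_STRAY = re.compile(r"</?(script|iframe)\b[^>]*>", re.IGNORECASE)


class InputTooLongError(ValueError):
    """Hypothetical error type; the API layer would map it to HTTP 400."""
    status_code = 400


def sanitize_user_input(text: str) -> str:
    """Reject over-long input, then strip <script>/<iframe> tags until nothing changes."""
    if len(text) > MAX_INPUT_CHARS:
        raise InputTooLongError(f"input exceeds {MAX_INPUT_CHARS} characters")
    previous = None
    while previous != text:  # repeat so that tags reassembled by a removal are caught too
        previous = text
        text = _STRAY.sub("", _PAIRED.sub("", text))
    return text
```

The loop matters for inputs such as `<scr<script>ipt>`, where a single pass would leave a fresh `<script>` tag behind.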
 
 ## 3. Output Content Security
 
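 Before showing LLM-generated output to the user, check the following.
-- **PII filtering:** Filter output so that resident registration numbers, email addresses, and similar data are not exposed.
-- **Harmful Content:** Use a separate small model or a filtering library to check for hate speech, dangerous information, and the like.
+- **PII filtering:**
+  - **Targets:** email addresses, resident registration numbers, phone numbers.
+  - **Detection rule:** Detect with the `Presidio` library and standard regular expressions (regex), then replace matches with `[MASK]`.
+- **Harmful Content:** Use a separate small model or a filtering library to check for hate speech, dangerous information, and the like; on detection, stop the response and return HTTP 403 - Forbidden.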
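
The regex half of the PII rule above might look like the following sketch. It deliberately leaves `Presidio` out and uses only the standard library; the patterns and the `mask_pii` name are assumptions for illustration, and real-world formats (especially phone numbers) vary more than these patterns cover.

```python
import re
from typing import List, Tuple

# Illustrative patterns only (assumed formats).
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "rrn": re.compile(r"\b\d{6}-\d{7}\b"),                 # resident registration number: 6 digits, hyphen, 7 digits
    "phone": re.compile(r"\b01[016789]-\d{3,4}-\d{4}\b"),  # hyphenated Korean mobile numbers
}


def mask_pii(text: str) -> Tuple[str, List[str]]:
    """Replace detected PII with [MASK]; return the masked text and the IDs of the rules that fired."""
    fired: List[str] = []
    for rule_id, pattern in PII_PATTERNS.items():
        text, count = pattern.subn("[MASK]", text)
        if count:
            fired.append(rule_id)
    return text, fired
```

Returning rule IDs rather than the matched values lets a caller record which rule fired without ever handling the raw PII again.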
 
 ## 4. API Security and Infrastructure
 
-- **Rate Limiting:** Limit the number of API calls per IP and per account to block runaway costs and attacks.
+- **Rate Limiting:**
+  - **Quotas:** Limit calls to 60 per minute per IP and 1,000 per day per account.
+  - **Enforcement:** When a quota is exceeded, return HTTP 429 - Too Many Requests with a `Retry-After` header.
 - **API Key Management:** Manage keys through environment variables (`.env`) and never expose them in the code repository.
+- **Logging Policy:**
+  - Log only anonymized incident IDs and rule IDs; never write raw data (especially PII) to logs.
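
The quota and `Retry-After` rules in the list above can be sketched with a fixed-window counter. This is a minimal in-memory sketch: `FixedWindowLimiter` is a hypothetical class, and a multi-instance deployment would need shared storage instead of a process-local dictionary.

```python
import math
import time
from typing import Dict, Optional, Tuple


class FixedWindowLimiter:
    """Counts calls per key inside fixed, aligned time windows (in-memory only)."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows: Dict[str, Tuple[int, int]] = {}  # key -> (window_start, count)

    def check(self, key: str, now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when the call is allowed."""
        now = time.time() if now is None else now
        window_start = int(now // self.window_seconds) * self.window_seconds
        start, count = self._windows.get(key, (window_start, 0))
        if start != window_start:  # a new window has begun, so reset the counter
            start, count = window_start, 0
        if count >= self.limit:
            retry_after = max(1, math.ceil(start + self.window_seconds - now))
            return False, retry_after
        self._windows[key] = (start, count + 1)
        return True, 0


# Quotas from the guideline: 60 calls per minute per IP, 1,000 calls per day per account.
per_ip = FixedWindowLimiter(limit=60, window_seconds=60)
per_account = FixedWindowLimiter(limit=1000, window_seconds=86_400)

allowed, retry_after = per_ip.check("203.0.113.7")
if not allowed:
    # The API layer would return this status and header to the client.
    status, headers = 429, {"Retry-After": str(retry_after)}
```

A fixed window is the simplest option and allows short bursts at window boundaries; a sliding window or token bucket would smooth those out at the cost of more state.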
 
 ---
 

diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
@@ -5,42 +5,59 @@ on:
     branches:
       - main
 
+# Prevent overlapping deployments but allow queuing
+concurrency:
+  group: "deploy-main"
+  cancel-in-progress: false
+
 env:
-  PROJECT_ID: 'vigilant-shift-490601-t5' # Using provided project ID
-  REGION: 'asia-northeast3' # Seoul region for better latency in KR
+  PROJECT_ID: 'vigilant-shift-490601-t5'
+  REGION: 'asia-northeast3'
   SERVICE_NAME: 'philo-rag-backend'
+  GAR_HOST: 'asia-northeast3-docker.pkg.dev'
+  IMAGE_NAME: 'philo-rag-backend'
+  # Full image URI, defined once and reused by the build, push, and deploy steps
+  IMAGE_URI: 'asia-northeast3-docker.pkg.dev/vigilant-shift-490601-t5/cloud-run-source-deploy/philo-rag-backend:latest'
 
 jobs:
   deploy-backend:
     name: Build & Deploy Backend to Cloud Run
     runs-on: ubuntu-latest
+    permissions:
+      contents: 'read'
+      id-token: 'write'
+    outputs:
+      backend_url: ${{ steps.deploy.outputs.url }}
     steps:
       - name: Checkout
         uses: actions/checkout@v4
 
       - name: Google Auth
         uses: google-github-actions/auth@v2
         with:
-          credentials_json: ${{ secrets.GCP_SA_KEY }}
+          # Authenticate with Workload Identity Federation instead of a long-lived service account key
+          workload_identity_provider: ${{ secrets.GCP_WORKLOAD_IDENTITY_PROVIDER }}
+          service_account: ${{ secrets.GCP_SERVICE_ACCOUNT_EMAIL }}
 
       - name: Set up Cloud SDK
         uses: google-github-actions/setup-gcloud@v2
 
       - name: Authorize Docker
-        run: gcloud auth configure-docker asia-northeast3-docker.pkg.dev
+        run: gcloud auth configure-docker ${{ env.GAR_HOST }}
 
       - name: Build and Push Container
         working-directory: ./backend
         run: |-
-          docker build -t "asia-northeast3-docker.pkg.dev/${{ env.PROJECT_ID }}/cloud-run-source-deploy/${{ env.SERVICE_NAME }}:${{ github.sha }}" .
-          docker push "asia-northeast3-docker.pkg.dev/${{ env.PROJECT_ID }}/cloud-run-source-deploy/${{ env.SERVICE_NAME }}:${{ github.sha }}"
+          docker build -t "${{ env.IMAGE_URI }}" .
+          docker push "${{ env.IMAGE_URI }}"
 
       - name: Deploy to Cloud Run
+        id: deploy
         uses: google-github-actions/deploy-cloudrun@v2
         with:
           service: ${{ env.SERVICE_NAME }}
           region: ${{ env.REGION }}
-          image: asia-northeast3-docker.pkg.dev/${{ env.PROJECT_ID }}/cloud-run-source-deploy/${{ env.SERVICE_NAME }}:${{ github.sha }}
+          image: ${{ env.IMAGE_URI }}
           env_vars: |-
             OPENAI_API_KEY=${{ secrets.OPENAI_API_KEY }}
             SUPABASE_URL=${{ secrets.SUPABASE_URL }}
@@ -68,7 +85,8 @@ jobs:
       - name: Build Frontend
         working-directory: ./frontend
         env:
-          NEXT_PUBLIC_API_BASE_URL: ${{ secrets.NEXT_PUBLIC_API_BASE_URL }}
+          # Injects the actual deployed backend URL into the frontend build
+          NEXT_PUBLIC_API_BASE_URL: ${{ needs.deploy-backend.outputs.backend_url }}
         run: npm run build
 
       - name: Deploy to Firebase Hosting

diff --git a/README.md b/README.md
@@ -1,4 +1,4 @@
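-# Philo-RAG (Conversations with Philosophers)
+# Philo-RAG (Conversations with Philosophers)
 > ⚠️ **Notice (Cold Start)**
 > The backend server for this project runs on a free cloud instance. It goes to sleep after a period with no requests, so **the first request (cold start) may take about one minute before the backend responds**.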
 
@@ -221,5 +221,3 @@ npm run dev
 ```
 Open `http://localhost:3000` to start using the system.
 
-Open `http://localhost:3000` to start using the system.
-
diff --git a/backend/Dockerfile b/backend/Dockerfile
@@ -17,12 +17,22 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 
-# Copy application code
-COPY . .
+# Setup model cache directory for pre-loading
+RUN mkdir -p /app/model_cache && chmod 777 /app/model_cache
+ENV HF_HOME=/app/model_cache
+
+# Pre-load the embedding model to reduce cold start latency
+RUN python -c "from langchain_huggingface import HuggingFaceEmbeddings; HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', model_kwargs={'device': 'cpu'})"
+
+# Create a non-privileged user for security
+RUN adduser --disabled-password --gecos "" appuser \
+    && chown -R appuser:appuser /app
+
+# Switch to non-privileged user
+USER appuser
 
 # Expose the port
 EXPOSE 8080
 
-# Command to run the application
-# We use 0.0.0.0 for Cloud Run
+# Command to run the application (0.0.0.0 for Cloud Run)
 CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080", "--proxy-headers"]
diff --git a/backend/download_books.py b/backend/download_books.py
@@ -181,7 +181,13 @@ def main():
     os.makedirs(data_dir, exist_ok=True)
 
     downloaded_count = 0
-    target_count = int(os.getenv("TARGET_COUNT", "300"))
+    raw_target_count = os.getenv("TARGET_COUNT", "300")
+    try:
+        target_count = int(raw_target_count)
+    except ValueError as exc:
+        raise ValueError(f"TARGET_COUNT must be an integer, got '{raw_target_count}'") from exc
+    if target_count <= 0:
+        raise ValueError(f"TARGET_COUNT must be greater than 0, got {target_count}")
 
     current_url = shelf_url
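
The `TARGET_COUNT` validation in this hunk generalizes to any positive-integer setting. The helper below is a hypothetical refactoring for illustration (the name `read_positive_int_env` is not in the source); it keeps the same two checks and error messages.

```python
import os


def read_positive_int_env(name: str, default: str) -> int:
    """Read an environment variable and require it to be an integer greater than 0."""
    raw = os.getenv(name, default)
    try:
        value = int(raw)
    except ValueError as exc:
        # Chain the original error so the traceback shows the failed conversion.
        raise ValueError(f"{name} must be an integer, got '{raw}'") from exc
    if value <= 0:
        raise ValueError(f"{name} must be greater than 0, got {value}")
    return value


target_count = read_positive_int_env("TARGET_COUNT", "300")
```

Failing at startup with a message that names the variable is easier to diagnose than a bare `int()` traceback or a loop that silently downloads nothing.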
 

diff --git a/backend/scripts/check_db.py b/backend/scripts/check_db.py
@@ -2,7 +2,7 @@
 import sys
 
 # Ensure we can import app modules
-sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
 from app.services.database import get_client
 

diff --git a/backend/scripts/ingest_data.py b/backend/scripts/ingest_data.py
@@ -8,14 +8,14 @@
 from typing import List, Dict
 
 # Ensure we can import app modules
-sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
+sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 
 from app.core.config import settings
 from app.services.embedding import embedding_service
 from app.services.database import get_client
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
-supabase_client = get_client()
+# The Supabase client is created inside ingest_document so that importing this module has no side effects
 
 class IngestionError(Exception):
     """Raised when data ingestion fails."""
@@ -49,23 +49,32 @@ def generate_deterministic_uuid(seed_text: str) -> str:
     """Generates a consistent UUID based on the input text to ensure idempotency."""
     return str(uuid.uuid5(UUID_NAMESPACE, seed_text))
 
+import re
+
 def strip_gutenberg_boilerplate(text: str) -> str:
     """Removes Project Gutenberg START and END identifiers from the text."""
-    start_marker = "*** START OF THE PROJECT GUTENBERG EBOOK"
-    end_marker = "*** END OF THE PROJECT GUTENBERG EBOOK"
+    # Robust START and END markers derived from standard Gutenberg patterns
+    start_marker_regex = r"\*\*\* START OF (THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*"
+    end_marker_regex = r"\*\*\* END OF (THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*"
 
-    start_idx = text.upper().find(start_marker)
-    if start_idx != -1:
-        # Move past the marker line
-        newline_idx = text.find("\n", start_idx)
-        if newline_idx != -1:
-            text = text[newline_idx+1:]
+    # Find START marker
+    start_match = re.search(start_marker_regex, text, re.IGNORECASE | re.DOTALL)
+    if start_match:
+        text = text[start_match.end():]
 
-    end_idx = text.upper().find(end_marker)
-    if end_idx != -1:
-        text = text[:end_idx]
+    # Find END marker 
+    # Use standard marker or common fallback license starters
+    end_match = re.search(end_marker_regex, text, re.IGNORECASE | re.DOTALL)
+    if end_match:
+        text = text[:end_match.start()]
+    else:
+        # Fallback to searching for the full license block if marker is missing
+        license_starter = r"THE FULL PROJECT GUTENBERG™ LICENSE"
+        license_match = re.search(license_starter, text, re.IGNORECASE)
+        if license_match:
+            text = text[:license_match.start()]
 
-    return text
+    return text.strip()
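
To see what the regex-based markers do, the sketch below re-declares a condensed copy of the function and runs it on two made-up samples: one with both markers and one where the END marker is missing and the license fallback applies. The sample texts are invented for illustration.

```python
import re

START_RE = r"\*\*\* START OF (THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*"
END_RE = r"\*\*\* END OF (THE|THIS) PROJECT GUTENBERG EBOOK.*?\*\*\*"
LICENSE_RE = r"THE FULL PROJECT GUTENBERG™ LICENSE"


def strip_gutenberg_boilerplate(text: str) -> str:
    """Condensed copy of the function in this diff, for demonstration only."""
    start = re.search(START_RE, text, re.IGNORECASE | re.DOTALL)
    if start:
        text = text[start.end():]  # drop everything up to and including the START line
    end = re.search(END_RE, text, re.IGNORECASE | re.DOTALL)
    if end is None:
        end = re.search(LICENSE_RE, text, re.IGNORECASE)  # fallback when the END marker is missing
    if end:
        text = text[:end.start()]
    return text.strip()


with_markers = (
    "Header text\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK THE REPUBLIC ***\n"
    "\n"
    "Book I. I went down yesterday to the Piraeus.\n"
    "\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK THE REPUBLIC ***\n"
    "License text\n"
)
license_only = (
    "*** START OF THIS PROJECT GUTENBERG EBOOK SAMPLE ***\n"
    "Body\n"
    "THE FULL PROJECT GUTENBERG™ LICENSE\n"
    "terms\n"
)

print(strip_gutenberg_boilerplate(with_markers))
print(strip_gutenberg_boilerplate(license_only))
```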
 
 def generate_embedding_with_retry(text: str, max_retries: int = 3):
     """Wrapper to handle rate limiting and retries for the embedding API."""
@@ -87,6 +96,7 @@ def ingest_document(text: str, philosopher: str, school: str, book_title: str, l
     Chunks text, fetches metadata, generates embeddings via multiprocessing,
     and upserts to Supabase in batches with idempotency.
     """
+    supabase_client = get_client() # Lazy initialization
     print(f"Starting ingestion for {philosopher} - {book_title}")
 
     # 1. Fetch metadata
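
The move from a module-level client to one created inside `ingest_document` can be illustrated with a cached factory. This is a sketch under assumptions: `FakeClient` stands in for the real Supabase client, and whether the project's `get_client` caches its result is not shown in this diff.

```python
from functools import lru_cache


class FakeClient:
    """Hypothetical stand-in for the real client; counts how many instances were created."""
    instances = 0

    def __init__(self) -> None:
        FakeClient.instances += 1


@lru_cache(maxsize=1)
def get_client() -> FakeClient:
    """Create the client on first use and reuse it afterwards."""
    return FakeClient()


def ingest_document(text: str) -> FakeClient:
    client = get_client()  # resolved at call time, so importing the module creates nothing
    return client
```

With this shape, importing the module in a test or a CLI `--help` path does not require database credentials; the first real call pays the setup cost once.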

diff --git a/backend/scripts/update_metadata.py b/backend/scripts/update_metadata.py
@@ -100,14 +100,19 @@ def update_metadata():
             # Prepare batch data for atomicity (at book level)
             batch_data = []
             for doc in res.data:
+                # Merge new fields into nested book_info to maintain structure
+                new_metadata = doc["metadata"].copy()
+                book_info = new_metadata.get("book_info", {}).copy()
+                book_info.update({
+                    "kr_title": meta["kr_title"],
+                    "cover_url": meta["thumbnail"] or book_info.get("cover_url", ""),
+                    "link": meta["link"] or book_info.get("link", "")
+                })
+                new_metadata["book_info"] = book_info
+
                 batch_data.append({
                     "id": doc["id"],
-                    "metadata": {
-                        **doc["metadata"],
-                        "kr_title": meta["kr_title"],
-                        "thumbnail": meta["thumbnail"],
-                        "link": meta["link"]
-                    }
+                    "metadata": new_metadata
                 })
 
             if batch_data:
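
The nested merge in this hunk can be checked in isolation. The helper below is a hypothetical extraction of the same logic (the name `merge_book_info` and the sample values are invented); it shows that existing `book_info` fields survive and that empty incoming values fall back to what is already stored.

```python
def merge_book_info(metadata: dict, meta: dict) -> dict:
    """Return a copy of metadata with the new fields merged into its nested book_info."""
    new_metadata = metadata.copy()
    book_info = new_metadata.get("book_info", {}).copy()
    book_info.update({
        "kr_title": meta["kr_title"],
        # Keep the stored value when the incoming one is empty.
        "cover_url": meta["thumbnail"] or book_info.get("cover_url", ""),
        "link": meta["link"] or book_info.get("link", ""),
    })
    new_metadata["book_info"] = book_info
    return new_metadata


existing = {
    "philosopher": "Plato",
    "book_info": {"title": "The Republic", "cover_url": "old.jpg"},
}
incoming = {"kr_title": "국가", "thumbnail": "", "link": "https://example.com/republic"}

merged = merge_book_info(existing, incoming)
print(merged["book_info"])
```

Copying both levels before updating keeps the original `doc["metadata"]` untouched, which matters if the same row is inspected again after the batch is built.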