Skip to content

Comments

Digest improvements: Exclude meaningless information#611

Merged
elie222 merged 1 commit intoelie222:mainfrom
edulelis:digest-emails-v13
Jul 31, 2025
Merged

Digest improvements: Exclude meaningless information#611
elie222 merged 1 commit intoelie222:mainfrom
edulelis:digest-emails-v13

Conversation

@edulelis
Copy link
Collaborator

@edulelis edulelis commented Jul 31, 2025

Summary by CodeRabbit

  • New Features
    • Improved email summarization by ensuring only human-relevant and readable information is included in structured classifications, excluding technical identifiers and opaque strings.

@vercel
Copy link

vercel bot commented Jul 31, 2025

@edulelis is attempting to deploy a commit to the Inbox Zero OSS Program Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jul 31, 2025

Walkthrough

A new section, "Content rules for structured classification," was added to the AI system prompt for email summarization. This section instructs the AI to include only human-relevant, human-readable information in structured summaries and to exclude technical identifiers like account IDs or tracking tokens. No other logic or control flow was changed.

Changes

Cohort / File(s) Change Summary
AI Summarization Prompt Update
apps/web/utils/ai/digest/summarize-email-for-digest.ts
Added explicit content rules to the system prompt, guiding the AI to exclude technical identifiers and focus on user-relevant information in structured summaries.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Possibly related PRs

Poem

In the warren where summaries grow,
We teach our AI what to show:
No account IDs or tokens long,
Only what helps humans along!
With prompts refined and rules anew,
Our summaries hop closer to you.
🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
apps/web/utils/ai/digest/summarize-email-for-digest.ts (1)

71-72: Avoid logging full prompt ‒ potential PII/PCI exposure

logger.trace("Input", { system, prompt }); records the entire e-mail body and rule name verbatim. Digest e-mails can contain personal data, so persisting them in logs risks accidental retention or leakage.

Minimal fix: log only metadata or a truncated/hash version of the content.

-logger.trace("Input", { system, prompt });
+logger.trace("Input", {
+  systemPreview: system.slice(0, 120),
+  promptPreview: prompt.slice(0, 200), // truncate to avoid PII retention
+});
🧹 Nitpick comments (1)
apps/web/utils/ai/digest/summarize-email-for-digest.ts (1)

52-55: Clarify & extend the new “Content rules” to guard against PII leakage

Nice addition! Consider tightening the guidance further so the model also omits personal identifiers that are human-readable but still sensitive (e-mail addresses, phone numbers, street addresses, credit-card PANs, etc.). This keeps the digest privacy-safe and consistent with GDPR/CCPA expectations.

 **Content rules for structured classification:**
 - Only include human-relevant and human-readable information.
 - Exclude opaque technical identifiers like account IDs, payment IDs, tracking tokens, or long alphanumeric strings that aren't meaningful to users.
+ - Exclude or redact sensitive personal data such as e-mail addresses, phone numbers, street addresses, or full credit-card numbers.
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 615e521 and 9cac0ce.

📒 Files selected for processing (1)
  • apps/web/utils/ai/digest/summarize-email-for-digest.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (10)
apps/web/**/*.{ts,tsx}

📄 CodeRabbit Inference Engine (apps/web/CLAUDE.md)

apps/web/**/*.{ts,tsx}: Use TypeScript with strict null checks
Path aliases: Use @/ for imports from project root
Use proper error handling with try/catch blocks
Format code with Prettier
Leverage TypeScript inference for better DX

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
!{.cursor/rules/*.mdc}

📄 CodeRabbit Inference Engine (.cursor/rules/cursor-rules.mdc)

Never place rule files in the project root, in subdirectories outside .cursor/rules, or in any other location

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
**/*.ts

📄 CodeRabbit Inference Engine (.cursor/rules/form-handling.mdc)

**/*.ts: The same validation should be done in the server action too
Define validation schemas using Zod

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
apps/web/utils/{ai,llms}/**/*

📄 CodeRabbit Inference Engine (.cursor/rules/llm.mdc)

apps/web/utils/{ai,llms}/**/*: LLM-related code must be organized in the directories: apps/web/utils/ai/, apps/web/utils/llms/, and apps/web/tests/ for LLM-specific tests.
Keep related AI functions in the same file or directory.

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
apps/web/utils/{ai,llms}/**/*.ts

📄 CodeRabbit Inference Engine (.cursor/rules/llm.mdc)

apps/web/utils/{ai,llms}/**/*.ts: Follow the standard structure for LLM-related functions: use a scoped logger, define a Zod schema for output, validate inputs early, separate system and user prompts, log inputs and outputs, call chatCompletionObject with proper configuration, and return validated results.
Keep system prompts and user prompts separate in LLM-related code.
System prompt should define the LLM's role and task specifications.
User prompt should contain the actual data and context.
Always define a Zod schema for response validation in LLM-related functions.
Make Zod schemas as specific as possible to guide the LLM output.
Use descriptive scoped loggers for each LLM feature.
Log inputs and outputs with appropriate log levels in LLM-related functions.
Include relevant context in log messages for LLM-related code.
Implement early returns for invalid inputs in LLM-related functions.
Use proper error types and logging in LLM-related code.
Implement fallbacks for AI failures in LLM-related functions.
Add retry logic for transient failures using withRetry in LLM-related code.
Use XML-like tags to structure data in LLM prompts.
Remove excessive whitespace and truncate long inputs in LLM prompts.
Format data consistently across similar LLM-related functions.
Use TypeScript types for all parameters and return values in LLM-related code.
Define clear interfaces for complex input/output structures in LLM-related code.
Extract common patterns into utility functions in LLM-related code.
Document complex AI logic with clear comments in LLM-related code.

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
**/*.{ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/logging.mdc)

**/*.{ts,tsx}: Use createScopedLogger for logging in backend TypeScript files
Typically add the logger initialization at the top of the file when using createScopedLogger
Only use .with() on a logger instance within a specific function, not for a global logger

Import Prisma in the project using import prisma from "@/utils/prisma";

**/*.{ts,tsx}: Don't use TypeScript enums.
Don't use TypeScript const enum.
Don't use the TypeScript directive @ts-ignore.
Don't use primitive type aliases or misleading types.
Don't use empty type parameters in type aliases and interfaces.
Don't use any or unknown as type constraints.
Don't use implicit any type on variable declarations.
Don't let variables evolve into any type through reassignments.
Don't use non-null assertions with the ! postfix operator.
Don't misuse the non-null assertion operator (!) in TypeScript files.
Don't use user-defined types.
Use as const instead of literal types and type annotations.
Use export type for types.
Use import type for types.
Don't declare empty interfaces.
Don't merge interfaces and classes unsafely.
Don't use overload signatures that aren't next to each other.
Use the namespace keyword instead of the module keyword to declare TypeScript namespaces.
Don't use TypeScript namespaces.
Don't export imported variables.
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions.
Don't use parameter properties in class constructors.
Use either T[] or Array consistently.
Initialize each enum member value explicitly.
Make sure all enum members are literal values.

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
apps/web/utils/**

📄 CodeRabbit Inference Engine (.cursor/rules/project-structure.mdc)

Create utility functions in utils/ folder for reusable logic

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
apps/web/utils/**/*.ts

📄 CodeRabbit Inference Engine (.cursor/rules/project-structure.mdc)

apps/web/utils/**/*.ts: Use lodash utilities for common operations (arrays, objects, strings)
Import specific lodash functions to minimize bundle size

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/ultracite.mdc)

**/*.{js,jsx,ts,tsx}: Don't use elements in Next.js projects.
Don't use elements in Next.js projects.
Don't use namespace imports.
Don't access namespace imports dynamically.
Don't use global eval().
Don't use console.
Don't use debugger.
Don't use var.
Don't use with statements in non-strict contexts.
Don't use the arguments object.
Don't use consecutive spaces in regular expression literals.
Don't use the comma operator.
Don't use unnecessary boolean casts.
Don't use unnecessary callbacks with flatMap.
Use for...of statements instead of Array.forEach.
Don't create classes that only have static members (like a static namespace).
Don't use this and super in static contexts.
Don't use unnecessary catch clauses.
Don't use unnecessary constructors.
Don't use unnecessary continue statements.
Don't export empty modules that don't change anything.
Don't use unnecessary escape sequences in regular expression literals.
Don't use unnecessary labels.
Don't use unnecessary nested block statements.
Don't rename imports, exports, and destructured assignments to the same name.
Don't use unnecessary string or template literal concatenation.
Don't use String.raw in template literals when there are no escape sequences.
Don't use useless case statements in switch statements.
Don't use ternary operators when simpler alternatives exist.
Don't use useless this aliasing.
Don't initialize variables to undefined.
Don't use the void operators (they're not familiar).
Use arrow functions instead of function expressions.
Use Date.now() to get milliseconds since the Unix Epoch.
Use .flatMap() instead of map().flat() when possible.
Use literal property access instead of computed property access.
Don't use parseInt() or Number.parseInt() when binary, octal, or hexadecimal literals work.
Use concise optional chaining instead of chained logical expressions.
Use regular expression literals instead of the RegExp constructor when possible.
Don't use number literal object member names th...

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
!pages/_document.{js,jsx,ts,tsx}

📄 CodeRabbit Inference Engine (.cursor/rules/ultracite.mdc)

!pages/_document.{js,jsx,ts,tsx}: Don't import next/document outside of pages/_document.jsx in Next.js projects.
Don't import next/document outside of pages/_document.jsx in Next.js projects.

Files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Document complex AI logic with clear comments in LLM-related code.
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : document complex ai logic with clear comments in llm-r...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Document complex AI logic with clear comments in LLM-related code.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : keep system prompts and user prompts separate in llm-r...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Keep system prompts and user prompts separate in LLM-related code.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : include relevant context in log messages for llm-relat...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Include relevant context in log messages for LLM-related code.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: in packages/resend/emails/digest.tsx, the digestemailprops type uses `[key: string]: digestitem[] | ...
Learnt from: edulelis
PR: elie222/inbox-zero#576
File: packages/resend/emails/digest.tsx:78-83
Timestamp: 2025-07-17T04:19:57.099Z
Learning: In packages/resend/emails/digest.tsx, the DigestEmailProps type uses `[key: string]: DigestItem[] | undefined | string | Date | undefined` instead of intersection types like `& Record<string, DigestItem[] | undefined>` due to implementation constraints. This was the initial implementation approach and cannot be changed to more restrictive typing.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : use xml-like tags to structure data in llm prompts....
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Use XML-like tags to structure data in LLM prompts.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : user prompt should contain the actual data and context...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : User prompt should contain the actual data and context.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : system prompt should define the llm's role and task sp...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : System prompt should define the LLM's role and task specifications.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : follow the standard structure for llm-related function...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Follow the standard structure for LLM-related functions: use a scoped logger, define a Zod schema for output, validate inputs early, separate system and user prompts, log inputs and outputs, call chatCompletionObject with proper configuration, and return validated results.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : make zod schemas as specific as possible to guide the ...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Make Zod schemas as specific as possible to guide the LLM output.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
📚 Learning: applies to apps/web/utils/{ai,llms}/**/*.ts : define clear interfaces for complex input/output struc...
Learnt from: CR
PR: elie222/inbox-zero#0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-07-18T15:06:40.272Z
Learning: Applies to apps/web/utils/{ai,llms}/**/*.ts : Define clear interfaces for complex input/output structures in LLM-related code.

Applied to files:

  • apps/web/utils/ai/digest/summarize-email-for-digest.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Static Code Analysis Js
  • GitHub Check: Jit Security
🔇 Additional comments (1)
apps/web/utils/ai/digest/summarize-email-for-digest.ts (1)

64-69: Consider pre-emptive token/size control for large mails

stringifyEmailSimple may return very long strings, pushing the request over model context limits or incurring extra cost. A simple guard such as truncateMiddle() (keeping head & tail) before interpolation would improve robustness.

No action required if stringifyEmailSimple already handles this, but worth verifying.

@elie222 elie222 merged commit 53c55e7 into elie222:main Jul 31, 2025
9 of 11 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Aug 24, 2025
@edulelis edulelis deleted the digest-emails-v13 branch August 29, 2025 14:37
@coderabbitai coderabbitai bot mentioned this pull request Sep 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants