fix: handle Outlook HTML content in draft similarity calculation#1196
- Update calculateSimilarity to accept ParsedMessage objects
- Convert Outlook HTML to plain text before comparison (bodyContentType='html')
- Use same normalization as isDraftUnmodified (stripQuotedContent, convertEmailHtmlToText)
- Add integration tests for Outlook and Gmail scenarios
- Fixes 0% match rate for Microsoft/Outlook drafts
Note: Other AI code review bot(s) detected
CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.
📝 Walkthrough
The PR refactors similarity calculation to accept full message/draft objects (including HTML) and adds normalization helpers for HTML-to-text conversion and quoted-content stripping. Draft-tracking callers were updated to pass full messages, and tests were expanded with integration-style cases for provider HTML and edge inputs.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Handle Outlook HTML in draft similarity scoring by updating
Actionable comments posted: 1
🧹 Nitpick comments (2)
apps/web/utils/similarity-score.test.ts (2)
1-11: Add `beforeEach` to clear mocks between tests.
According to coding guidelines, tests should include `beforeEach(() => { vi.clearAllMocks(); })` to ensure test isolation. While the current tests use the real implementation, adding this would ensure consistency with project standards and prevent issues if mocks are added later.
🔎 Suggested addition

     describe("calculateSimilarity - basic tests", () => {
    +  beforeEach(() => {
    +    vi.clearAllMocks();
    +  });
    +
       it("should return 0.0 if either text is null or undefined", () => {

As per coding guidelines for `**/*.test.{ts,tsx}`.
115-137: Consider extracting `createParsedMessage` to shared test helpers.
The `createParsedMessage` helper is well-structured and could be useful for other test files that need to create mock `ParsedMessage` objects. Consider moving it to `@/__tests__/helpers.ts` for reusability across the codebase, especially if other tests need to work with `ParsedMessage` objects.
Based on learnings that recommend using existing helpers from `@/__tests__/helpers.ts`.
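As a rough illustration of the kind of helper under discussion, a factory with overridable defaults could look like the sketch below. `ParsedMessageSketch` and `createParsedMessageSketch` are hypothetical names, the field list is a trimmed-down assumption rather than the real `ParsedMessage` type, and `"text"` as the non-HTML `bodyContentType` value is also assumed.

```typescript
// Hypothetical, trimmed-down shape. The real ParsedMessage type in
// apps/web/utils/types.ts has more fields than shown here.
type ParsedMessageSketch = {
  id: string;
  threadId: string;
  textPlain?: string;
  textHtml?: string;
  bodyContentType?: "text" | "html"; // "text" is an assumed value
};

// Factory with defaults; each test overrides only the fields it cares about.
function createParsedMessageSketch(
  overrides: Partial<ParsedMessageSketch> = {},
): ParsedMessageSketch {
  return {
    id: "msg-1",
    threadId: "thread-1",
    textPlain: "Hello, world",
    bodyContentType: "text",
    ...overrides,
  };
}
```

A test for the Outlook case would then only need `createParsedMessageSketch({ bodyContentType: "html", textPlain: "<p>Hi</p>" })`, which keeps the scenario-specific fields visible at the call site.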
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.test.ts
- apps/web/utils/similarity-score.ts
🧰 Additional context used
📓 Path-based instructions (17)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/data-fetching.mdc)
**/*.{ts,tsx}: For API GET requests to server, use the `swr` package
Use `result?.serverError` with `toastError` from `@/components/Toast` for error handling in async operations
**/*.{ts,tsx}: Use wrapper functions for Gmail message operations (get, list, batch, etc.) from @/utils/gmail/message.ts instead of direct API calls
Use wrapper functions for Gmail thread operations from @/utils/gmail/thread.ts instead of direct API calls
Use wrapper functions for Gmail label operations from @/utils/gmail/label.ts instead of direct API calls
**/*.{ts,tsx}: For early access feature flags, create hooks using the naming convention `use[FeatureName]Enabled` that return a boolean from `useFeatureFlagEnabled("flag-key")`
For A/B test variant flags, create hooks using the naming convention `use[FeatureName]Variant` that define variant types, use `useFeatureFlagVariantKey()` with type casting, and provide a default "control" fallback
Use kebab-case for PostHog feature flag keys (e.g., `inbox-cleaner`, `pricing-options-2`)
Always define types for A/B test variant flags (e.g., `type PricingVariant = "control" | "variant-a" | "variant-b"`) and provide type safety through type casting
**/*.{ts,tsx}: Don't use primitive type aliases or misleading types
Don't use empty type parameters in type aliases and interfaces
Don't use this and super in static contexts
Don't use any or unknown as type constraints
Don't use the TypeScript directive @ts-ignore
Don't use TypeScript enums
Don't export imported variables
Don't add type annotations to variables, parameters, and class properties that are initialized with literal expressions
Don't use TypeScript namespaces
Don't use non-null assertions with the `!` postfix operator
Don't use parameter properties in class constructors
Don't use user-defined types
Use `as const` instead of literal types and type annotations
Use either `T[]` or `Array<T>` consistently
Initialize each enum member value explicitly
Use `export type` for types
Use `impo...
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.cursor/rules/prisma-enum-imports.mdc)
Always import Prisma enums from `@/generated/prisma/enums` instead of `@/generated/prisma/client` to avoid Next.js bundling errors in client components
Import Prisma using the project's centralized utility: `import prisma from '@/utils/prisma'`
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
apps/web/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)
Import specific lodash functions rather than entire lodash library to minimize bundle size (e.g., `import groupBy from 'lodash/groupBy'`)
apps/web/**/*.{ts,tsx}: Use TypeScript with strict null checks
Do not export types/interfaces that are only used within the same file. Export later if needed
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/security.mdc)
**/*.ts: ALL database queries MUST be scoped to the authenticated user/account by including user/account filtering in WHERE clauses to prevent unauthorized data access
Always validate that resources belong to the authenticated user before performing operations, using ownership checks in WHERE clauses or relationships
Always validate all input parameters for type, format, and length before using them in database queries
Use SafeError for error responses to prevent information disclosure. Generic error messages should not reveal internal IDs, logic, or resource ownership details
Only return necessary fields in API responses using Prisma's `select` option. Never expose sensitive data such as password hashes, private keys, or system flags
Prevent Insecure Direct Object References (IDOR) by validating resource ownership before operations. All `findUnique`/`findFirst` calls MUST include ownership filters
Prevent mass assignment vulnerabilities by explicitly whitelisting allowed fields in update operations instead of accepting all user-provided data
Prevent privilege escalation by never allowing users to modify system fields, ownership fields, or admin-only attributes through user input
All `findMany` queries MUST be scoped to the user's data by including appropriate WHERE filters to prevent returning data from other users
Use Prisma relationships for access control by leveraging nested where clauses (e.g., `emailAccount: { id: emailAccountId }`) to validate ownership
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{tsx,ts}
📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)
**/*.{tsx,ts}: Use Shadcn UI and Tailwind for components and styling
Use `next/image` package for images
For API GET requests to server, use the `swr` package with hooks like `useSWR` to fetch data
For text inputs, use the `Input` component with `registerProps` for form integration and error handling
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{tsx,ts,css}
📄 CodeRabbit inference engine (.cursor/rules/ui-components.mdc)
Implement responsive design with Tailwind CSS using a mobile-first approach
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{js,jsx,ts,tsx}: Don't use `accessKey` attribute on any HTML element
Don't set `aria-hidden="true"` on focusable elements
Don't add ARIA roles, states, and properties to elements that don't support them
Don't use distracting elements like `<marquee>` or `<blink>`
Only use the `scope` prop on `<th>` elements
Don't assign non-interactive ARIA roles to interactive HTML elements
Make sure label elements have text content and are associated with an input
Don't assign interactive ARIA roles to non-interactive HTML elements
Don't assign `tabIndex` to non-interactive HTML elements
Don't use positive integers for `tabIndex` property
Don't include "image", "picture", or "photo" in img alt prop
Don't use explicit role property that's the same as the implicit/default role
Make static elements with click handlers use a valid role attribute
Always include a `title` element for SVG elements
Give all elements requiring alt text meaningful information for screen readers
Make sure anchors have content that's accessible to screen readers
Assign `tabIndex` to non-interactive HTML elements with `aria-activedescendant`
Include all required ARIA attributes for elements with ARIA roles
Make sure ARIA properties are valid for the element's supported roles
Always include a `type` attribute for button elements
Make elements with interactive roles and handlers focusable
Give heading elements content that's accessible to screen readers (not hidden with `aria-hidden`)
Always include a `lang` attribute on the html element
Always include a `title` attribute for iframe elements
Accompany `onClick` with at least one of: `onKeyUp`, `onKeyDown`, or `onKeyPress`
Accompany `onMouseOver`/`onMouseOut` with `onFocus`/`onBlur`
Include caption tracks for audio and video elements
Use semantic elements instead of role attributes in JSX
Make sure all anchors are valid and navigable
Ensure all ARIA properties (aria-*) are valid
Use valid, non-abstract ARIA roles for elements with ARIA roles
Use valid AR...
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
!(pages/_document).{jsx,tsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
Don't use the next/head module in pages/_document.js on Next.js projects
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{js,ts,jsx,tsx}
📄 CodeRabbit inference engine (.cursor/rules/utilities.mdc)
**/*.{js,ts,jsx,tsx}: Use lodash utilities for common operations (arrays, objects, strings)
Import specific lodash functions to minimize bundle size (e.g., `import groupBy from 'lodash/groupBy'`)
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/{utils,helpers,lib}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/logging.mdc)
Logger should be passed as a parameter to helper functions instead of creating their own logger instances
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
apps/web/**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (apps/web/CLAUDE.md)
apps/web/**/*.{ts,tsx,js,jsx}: Use `@/` path aliases for imports from project root
Prefer self-documenting code over comments; use descriptive variable and function names instead of explaining intent with comments
Add helper functions to the bottom of files, not the top
All imports go at the top of files, no mid-file dynamic imports
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
apps/web/**/*.{ts,tsx,js,jsx,json,css}
📄 CodeRabbit inference engine (apps/web/CLAUDE.md)
Format code with Prettier
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
apps/web/**/*.{example,ts,json}
📄 CodeRabbit inference engine (apps/web/CLAUDE.md)
Add environment variables to `.env.example`, `env.ts`, and `turbo.json`
Files:
- apps/web/utils/reply-tracker/draft-tracking.ts
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
**/*.{test,spec}.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/ultracite.mdc)
**/*.{test,spec}.{js,jsx,ts,tsx}: Don't nest describe() blocks too deeply in test files
Don't use callbacks in asynchronous tests and hooks
Don't have duplicate hooks in describe blocks
Don't use export or module.exports in test files
Don't use focused tests
Make sure the assertion function, like expect, is placed inside an it() function call
Don't use disabled tests
Files:
apps/web/utils/similarity-score.test.ts
apps/web/**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (apps/web/CLAUDE.md)
Co-locate test files next to source files (e.g., `utils/example.test.ts`). Only E2E and AI tests go in `__tests__/`
Files:
apps/web/utils/similarity-score.test.ts
**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)
**/*.test.{ts,tsx}: Use `vitest` as the testing framework
Colocate test files next to the tested file with `.test.ts` or `.test.tsx` naming convention (e.g., `dir/format.ts` and `dir/format.test.ts`)
Mock `server-only` using `vi.mock("server-only", () => ({}))`
Mock Prisma using `vi.mock("@/utils/prisma")` and the provided mock from `@/utils/__mocks__/prisma`
Use test helper functions `getEmail`, `getEmailAccount`, and `getRule` from `@/__tests__/helpers` for creating mock data
Clear all mocks between tests using `beforeEach(() => { vi.clearAllMocks(); })`
Use descriptive test names that clearly indicate what is being tested
Do not mock the Logger in tests
Files:
apps/web/utils/similarity-score.test.ts
**/*.test.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/notes.mdc)
Co-locate test files next to source files (e.g., `utils/example.test.ts`). Only E2E and AI tests go in `__tests__/`
Files:
apps/web/utils/similarity-score.test.ts
🧠 Learnings (13)
📚 Learning: 2025-11-25T14:37:22.660Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/gmail-api.mdc:0-0
Timestamp: 2025-11-25T14:37:22.660Z
Learning: Applies to **/*.{ts,tsx} : Use wrapper functions for Gmail thread operations from @/utils/gmail/thread.ts instead of direct API calls
Applied to files:
apps/web/utils/reply-tracker/draft-tracking.ts
📚 Learning: 2025-11-25T14:37:22.660Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/gmail-api.mdc:0-0
Timestamp: 2025-11-25T14:37:22.660Z
Learning: Applies to **/*.{ts,tsx} : Use wrapper functions for Gmail message operations (get, list, batch, etc.) from @/utils/gmail/message.ts instead of direct API calls
Applied to files:
apps/web/utils/reply-tracker/draft-tracking.ts
📚 Learning: 2025-11-25T14:38:07.606Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Use XML-like tags to structure data in prompts, remove excessive whitespace and truncate long inputs, and format data consistently across similar LLM functions
Applied to files:
apps/web/utils/similarity-score.ts
📚 Learning: 2025-11-25T14:38:07.606Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm.mdc:0-0
Timestamp: 2025-11-25T14:38:07.606Z
Learning: Applies to apps/web/utils/ai/**/*.ts : Keep related AI functions in the same file or directory, extract common patterns into utility functions, and document complex AI logic with clear comments
Applied to files:
- apps/web/utils/similarity-score.ts
- apps/web/utils/similarity-score.test.ts
📚 Learning: 2025-11-25T14:37:56.430Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Prefer using existing helpers from `@/__tests__/helpers.ts` (`getEmailAccount`, `getEmail`, `getRule`, `getMockMessage`, `getMockExecutedRule`) instead of creating custom test data helpers
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2026-01-01T10:42:29.775Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2026-01-01T10:42:29.775Z
Learning: Applies to **/*.test.{ts,tsx} : Use test helper functions `getEmail`, `getEmailAccount`, and `getRule` from `@/__tests__/helpers` for creating mock data
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2025-11-25T14:37:56.430Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use vitest imports (`describe`, `expect`, `test`, `vi`, `beforeEach`) in LLM test files
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2026-01-01T10:42:29.775Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2026-01-01T10:42:29.775Z
Learning: Applies to **/__tests__/**/*.{ts,tsx} : Place AI tests in the `__tests__` directory and do not run them by default as they use a real LLM
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2026-01-01T10:42:29.775Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2026-01-01T10:42:29.775Z
Learning: Applies to **/*.test.{ts,tsx} : Use descriptive test names that clearly indicate what is being tested
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2025-12-21T12:21:37.794Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: apps/web/CLAUDE.md:0-0
Timestamp: 2025-12-21T12:21:37.794Z
Learning: Applies to apps/web/**/*.test.{ts,tsx} : Co-locate test files next to source files (e.g., `utils/example.test.ts`). Only E2E and AI tests go in `__tests__/`
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2025-11-25T14:37:56.430Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use `describe.runIf(isAiTest)` with environment variable `RUN_AI_TESTS === "true"` to conditionally run LLM tests
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2025-11-25T14:37:56.430Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/llm-test.mdc:0-0
Timestamp: 2025-11-25T14:37:56.430Z
Learning: Applies to apps/web/__tests__/**/*.test.ts : Use `console.debug()` for outputting generated LLM content in tests, e.g., `console.debug("Generated content:\n", result.content);`
Applied to files:
apps/web/utils/similarity-score.test.ts
📚 Learning: 2026-01-01T10:42:29.775Z
Learnt from: CR
Repo: elie222/inbox-zero PR: 0
File: .cursor/rules/testing.mdc:0-0
Timestamp: 2026-01-01T10:42:29.775Z
Learning: Applies to **/*.test.{ts,tsx} : Clear all mocks between tests using `beforeEach(() => { vi.clearAllMocks(); })`
Applied to files:
apps/web/utils/similarity-score.test.ts
🧬 Code graph analysis (3)
apps/web/utils/reply-tracker/draft-tracking.ts (1)
apps/web/utils/similarity-score.ts (1)
calculateSimilarity (47-78)
apps/web/utils/similarity-score.ts (3)
apps/web/utils/mail.ts (2)
convertEmailHtmlToText (105-126)
parseReply (9-13)
apps/web/utils/ai/choose-rule/draft-management.ts (1)
stripQuotedContent (140-158)
apps/web/utils/types.ts (1)
ParsedMessage (51-73)
apps/web/utils/similarity-score.test.ts (1)
apps/web/utils/similarity-score.ts (1)
calculateSimilarity (47-78)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Baz Reviewer
- GitHub Check: Macroscope - Correctness Check
- GitHub Check: test
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (8)
apps/web/utils/similarity-score.test.ts (3)
13-97: Excellent test coverage for basic scenarios.
The basic tests comprehensively cover edge cases including identical texts, case-insensitivity, whitespace normalization, empty inputs, and realistic email content. The test descriptions are clear and specific.
139-174: Strong integration tests for Outlook HTML content handling.
These tests directly address the core issue described in the PR objectives: handling Outlook's HTML content in draft similarity calculations. The scenarios cover plain text comparison with HTML responses, signature handling, and quoted content stripping, all critical for fixing the 0% match rate issue mentioned for Microsoft Outlook users.
176-242: Comprehensive coverage of Gmail, cross-platform scenarios, and backwards compatibility.
The integration tests effectively validate:
- Gmail quoted content stripping
- Cross-platform newline handling (`\n` vs `\r\n`)
- Real-world sent email tracking scenarios
- Backwards compatibility with plain string arguments
This ensures the refactoring doesn't break existing functionality while adding Outlook support.
apps/web/utils/similarity-score.ts (3)
13-20: LGTM: Correct normalization for stored content.
The normalization logic correctly implements the PR objective by converting newlines to `<br>` tags (matching stored format), converting to plain text, stripping quoted content, and normalizing case/whitespace. This aligns with how `isDraftUnmodified` handles content comparison.
26-37: LGTM: Correct Outlook HTML handling.
This function addresses the core issue described in the PR objectives. When `bodyContentType === 'html'` (Outlook case), it properly converts HTML to plain text before comparison. The normalization steps (strip quotes, lowercase, trim) are consistent with `normalizeStoredContent`, ensuring accurate similarity calculation.
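The two normalization paths described in these comments can be sketched roughly as follows. This is not the PR's code: `htmlToTextSketch` and `stripQuotedSketch` are simplified, hypothetical stand-ins for `convertEmailHtmlToText` and `stripQuotedContent`, whose real behavior is not shown on this page, and the whitespace collapsing on the provider side is an assumption made so both sides end up comparable.

```typescript
// Hypothetical stand-in for convertEmailHtmlToText; the real helper is
// likely more thorough about tags and entities.
function htmlToTextSketch(html: string): string {
  return html
    .replace(/<br\s*\/?>/gi, "\n") // explicit line breaks become newlines
    .replace(/<\/(p|div)>/gi, "\n") // closing block tags end a line
    .replace(/<[^>]+>/g, "") // drop any remaining tags
    .replace(/&nbsp;/gi, " ")
    .replace(/&amp;/gi, "&");
}

// Hypothetical stand-in for stripQuotedContent: keep everything before the
// first "On ... wrote:" header or ">"-prefixed line.
function stripQuotedSketch(text: string): string {
  const kept: string[] = [];
  for (const line of text.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (/^On .+ wrote:$/.test(trimmed) || trimmed.startsWith(">")) break;
    kept.push(line);
  }
  return kept.join("\n");
}

// Shared final step: lowercase, collapse whitespace, trim.
function finalizeSketch(text: string): string {
  return stripQuotedSketch(text).toLowerCase().replace(/\s+/g, " ").trim();
}

// Stored drafts: newlines are first turned into <br> to match the stored
// format, then converted back to text like any other HTML.
function normalizeStoredSketch(stored: string): string {
  return finalizeSketch(htmlToTextSketch(stored.replace(/\r?\n/g, "<br>")));
}

// Provider content: only converted when the provider marks the body as HTML.
function normalizeProviderSketch(
  content: string,
  bodyContentType?: "text" | "html",
): string {
  const text = bodyContentType === "html" ? htmlToTextSketch(content) : content;
  return finalizeSketch(text);
}
```

With these stand-ins, a stored draft with mixed line endings and an Outlook-style body such as `<div>Hello World</div><div>Thanks!</div>` both reduce to the same lowercase, single-spaced string, which is the property the similarity comparison relies on.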
47-78: LGTM: Well-structured refactoring with backwards compatibility.
The updated `calculateSimilarity` function correctly:
- Maintains backwards compatibility by accepting both `string` and `ParsedMessage` as the second parameter
- Uses the appropriate normalization path based on input type
- Handles edge cases (null, undefined, empty strings) gracefully
- Includes clear comments explaining the backwards compatibility approach
The implementation directly addresses the PR objectives by enabling proper Outlook HTML content handling while preserving existing Gmail functionality.
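For readers who want to see the shape of such a signature, here is a minimal sketch of a function that accepts either a plain string or a message object as its second argument. All names are hypothetical, the HTML branch is deliberately left out to keep the focus on the type dispatch, and the bigram Dice coefficient is only a stand-in for whatever scoring the project's string-similarity dependency performs.

```typescript
// Hypothetical minimal message shape used only for this sketch.
type ProviderMessageSketch = { textPlain?: string };

// Dice coefficient over character bigrams (stand-in scorer).
function diceCoefficientSketch(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const counts = new Map<string, number>();
  for (let i = 0; i < a.length - 1; i++) {
    const gram = a.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  let overlap = 0;
  for (let i = 0; i < b.length - 1; i++) {
    const gram = b.slice(i, i + 2);
    const remaining = counts.get(gram) ?? 0;
    if (remaining > 0) {
      counts.set(gram, remaining - 1); // each bigram in `a` matches at most once
      overlap++;
    }
  }
  return (2 * overlap) / (a.length - 1 + (b.length - 1));
}

function basicNormalizeSketch(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

function calculateSimilaritySketch(
  stored: string | null | undefined,
  provider: string | ProviderMessageSketch | null | undefined,
): number {
  if (!stored || !provider) return 0;
  // Backwards-compatible path: a plain string is used directly; a message
  // object contributes its textPlain field.
  const providerText =
    typeof provider === "string" ? provider : (provider.textPlain ?? "");
  const a = basicNormalizeSketch(stored);
  const b = basicNormalizeSketch(providerText);
  if (!a || !b) return 0;
  return diceCoefficientSketch(a, b);
}
```

The `typeof provider === "string"` check is what lets existing string-based callers keep working unchanged while newer callers pass the whole message.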
apps/web/utils/reply-tracker/draft-tracking.ts (2)
22-83: LGTM: Correct integration with updated `calculateSimilarity`.
The changes properly adapt to the refactored `calculateSimilarity` function:
- Line 22: Added `sentMessageId` destructuring (used later for logging)
- Line 83: Now passes the full `message` object instead of just `textPlain`, enabling proper Outlook HTML content handling via `bodyContentType`
The explanatory comment on line 82 clearly documents the purpose of passing the full message object.
169-173: LGTM: Consistent integration in draft cleanup flow.
The change to pass the full `draftDetails` object (instead of just `textPlain`) is consistent with the `trackSentDraftStatus` changes and enables proper Outlook HTML content handling during draft cleanup. The comment on line 169 clearly documents the purpose.
apps/web/utils/similarity-score.ts
Outdated
    // ParsedMessage - use proper normalization with bodyContentType
    const text = providerMessage.textPlain || "";
    normalized2 = normalizeProviderContent(
      text,
      providerMessage.bodyContentType,
    );
When textPlain is empty (common in Gmail/Outlook HTML-only replies), the similarity check compares against an empty string and never detects unmodified drafts. Should we fall back to textHtml?
    - // ParsedMessage - use proper normalization with bodyContentType
    - const text = providerMessage.textPlain || "";
    - normalized2 = normalizeProviderContent(
    -   text,
    -   providerMessage.bodyContentType,
    - );
    + const text = providerMessage.textPlain || providerMessage.textHtml || "";
    + const contentType = providerMessage.textPlain ? providerMessage.bodyContentType : "html";
    + normalized2 = normalizeProviderContent(
    +   text,
    +   contentType,
    + );
Finding types: AI Coding Guidelines Type Inconsistency
Commit 2bf832a addressed this comment. The code now falls back to providerMessage.textHtml when textPlain is empty, and dynamically sets the content type to "html" in such cases. This implementation matches the reviewer's suggestion and solves the issue where HTML-only emails would result in empty string comparisons.
Good catch! Implemented the fallback to textHtml when textPlain is empty, treating it as HTML content.
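The fallback discussed in this thread can be isolated into a small selection step, sketched below. `BodySourceSketch` and `pickComparableBody` are hypothetical names introduced for illustration, and `"text"` as the non-HTML content type is an assumption; only the selection logic mirrors the suggestion above.

```typescript
// Hypothetical minimal shape; the real ParsedMessage has more fields.
type BodySourceSketch = {
  textPlain?: string;
  textHtml?: string;
  bodyContentType?: "text" | "html";
};

type ComparableBodySketch = {
  text: string;
  contentType: "text" | "html" | undefined;
};

function pickComparableBody(message: BodySourceSketch): ComparableBodySketch {
  if (message.textPlain) {
    // A plain-text part exists: keep whatever content type the provider reported.
    return { text: message.textPlain, contentType: message.bodyContentType };
  }
  // No plain-text part: fall back to the HTML part and force HTML handling.
  return { text: message.textHtml ?? "", contentType: "html" };
}
```

Forcing `"html"` in the fallback branch matters because the downstream normalizer only converts HTML to text when told the content is HTML; without it, an HTML-only message would be compared with its tags intact.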
- Remove unused extractDraftPlainText import
- Add fallback to textHtml when textPlain is empty (handles HTML-only emails)
- Update condition to check for textPlain OR textHtml
User description
Fixes draft matching/deletion failing for Microsoft Outlook users.
TLDR: The `calculateSimilarity` function wasn't handling Outlook HTML content, causing 0% match rate for Microsoft users vs 21% for Gmail.
Problem
Root Cause
When `bodyContentType === 'html'` (Outlook), the `textPlain` field contains HTML that needs to be converted to plain text. The old `calculateSimilarity` passed this directly to `parseReply`, which doesn't handle HTML tags.
Changes
- Update `calculateSimilarity` to accept `ParsedMessage` objects (with backwards compatibility for plain strings)
- Use the same normalization as `isDraftUnmodified`: `convertEmailHtmlToText()` + `stripQuotedContent()`
- Pass the full `ParsedMessage` from `draft-tracking.ts` instead of just `textPlain`
Files Changed
- `apps/web/utils/similarity-score.ts` - Updated function signature and normalization
- `apps/web/utils/reply-tracker/draft-tracking.ts` - Pass full ParsedMessage
- `apps/web/utils/similarity-score.test.ts` - Added integration tests
Generated description
Below is a concise technical summary of the changes proposed in this PR:
    graph LR
    trackSentDraftStatus_("trackSentDraftStatus"):::modified
    calculateSimilarity_("calculateSimilarity"):::modified
    cleanupThreadAIDrafts_("cleanupThreadAIDrafts"):::modified
    normalizeStoredContent_("normalizeStoredContent"):::added
    normalizeProviderContent_("normalizeProviderContent"):::added
    CONVERT_EMAIL_HTML_TO_TEXT_("CONVERT_EMAIL_HTML_TO_TEXT"):::added
    STRIP_QUOTED_CONTENT_("STRIP_QUOTED_CONTENT"):::added
    STRING_SIMILARITY_("STRING_SIMILARITY"):::modified
    trackSentDraftStatus_ -- "Now passes full message object to handle Outlook HTML." --> calculateSimilarity_
    cleanupThreadAIDrafts_ -- "Now passes full draftDetails to handle HTML and quoted content." --> calculateSimilarity_
    calculateSimilarity_ -- "Normalizes stored drafts via HTML-to-text and strip quoted." --> normalizeStoredContent_
    calculateSimilarity_ -- "Normalizes provider messages, handling HTML and stripping quoted." --> normalizeProviderContent_
    normalizeStoredContent_ -- "Converts stored draft HTML (<br>) to plain text." --> CONVERT_EMAIL_HTML_TO_TEXT_
    normalizeStoredContent_ -- "Removes quoted reply blocks from stored draft content." --> STRIP_QUOTED_CONTENT_
    normalizeProviderContent_ -- "Converts Outlook HTML provider bodies to plain text." --> CONVERT_EMAIL_HTML_TO_TEXT_
    normalizeProviderContent_ -- "Strips quoted replies from provider message content." --> STRIP_QUOTED_CONTENT_
    calculateSimilarity_ -- "Compares normalized texts with string-similarity for score." --> STRING_SIMILARITY_
    classDef added stroke:#15AA7A
    classDef removed stroke:#CD5270
    classDef modified stroke:#EDAC4C
    linkStyle default stroke:#CBD5E1,font-size:13px

Updates the `calculateSimilarity` function to properly handle HTML content from Outlook and other email providers, resolving issues where draft matching and deletion failed due to incorrect similarity scores. Modifies `draft-tracking.ts` to pass full `ParsedMessage` objects to the updated similarity calculation logic.
Summary by CodeRabbit
Tests
Refactor