
Orama search relevance improvements #33

Merged
Shunseii merged 7 commits into main from
refactor/two-pass-orama-search
Dec 30, 2025

Conversation

@Shunseii
Owner

@Shunseii Shunseii commented Dec 29, 2025

Summary by CodeRabbit

  • New Features

    • New tag input component with async search, creation, and keyboard accessibility
    • Persisted "suggested tags" stored and surfaced when adding entries
  • Improvements

    • Search enhanced with morphology-aware fields (singular/plural, verb forms) and exact-field support
    • Two-pass exact+fuzzy search with improved ranking and pagination
    • Arabic normalization refined to preserve word distinctiveness and exact-token handling


@Shunseii Shunseii self-assigned this Dec 29, 2025
@coderabbitai

coderabbitai Bot commented Dec 29, 2025

📝 Walkthrough


This pull request enriches search documents with exact and normalized morphology fields, implements a two-pass exact+fuzzy search in the search library, adapts tokenization for exact Arabic fields, updates hydrate/rehydrate document transformation across web/mobile, refactors web search hooks to use the centralized search, and adds a session-backed tag combobox and suggested-tags atom.

Changes

Cohort: Search Core Infrastructure
Files: packages/search/src/database.ts, packages/search/src/schema.ts, packages/search/src/tokenizer.ts, packages/search/src/arabic.ts
Summary: Adds two-pass search (EXACT_PROPERTIES then NORMALIZED_PROPERTIES) with boosting and language param; extends schema with flattened morphology and exact variants plus word_exact; adds arabicExactTokenizer and exact-field tokenization; removes weak-letter normalization from primary Arabic normalization.

Cohort: Search Document Transformation
Files: apps/mobile/src/lib/search/index.ts, apps/web/src/lib/search/index.ts
Summary: Computes and attaches word_exact and a structured morphology (ism/verb normalized and exact variants) during hydrate and rehydrate; adds toOramaDocument() in web for transforming SelectDictionaryEntry -> DictionaryDocument.

Cohort: Web Search Hooks & Refactoring
Files: apps/web/src/hooks/useSearch.ts
Summary: Replaces SelectDictionaryEntry with DictionaryDocument everywhere, simplifies search params to { term?, offset? }, and switches hook implementation to call searchDictionary(...) (language-aware). Updates atoms and return types to DictionaryDocument.

Cohort: Tokenizer / Tokenization Props
Files: packages/search/src/tokenizer.ts
Summary: Adds EXACT_PROPS listing exact-field names, uses stripArabicDiacritics + arabicExactTokenizer for those fields, exports arabicExactTokenizer.

Cohort: Tags UI & State
Files: apps/web/src/atoms/suggested-tags.ts, apps/web/src/components/TagsCombobox.tsx, apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
Summary: Introduces suggestedTagsAtom (sessionStorage-backed), new generic TagsCombobox<T> component (debounced async queries, keyboard navigation, optional creation), and refactors TagsFormSection to use the combobox and suggested-tags flow.

Cohort: DB Integration & Indexing
Files: apps/web/src/hooks/db/index.ts
Summary: Persists suggested tags after add, and replaces raw insert/update into Orama with toOramaDocument(...) to ensure word_exact and morphology are indexed.

Cohort: Utilities & Config
Files: apps/web/src/lib/utils.ts, biome.jsonc
Summary: Removed two eslint-disable comments in nullToUndefined; disabled two biome lint rules (useAtIndex, useDefaultSwitchClause) by setting them to "off".

Sequence Diagram(s)

sequenceDiagram
    participant Client as useSearch Hook / UI
    participant SearchFn as searchDictionary()
    participant ExactPass as Exact Pass<br/>(EXACT_PROPERTIES)
    participant FuzzyPass as Normalized Pass<br/>(NORMALIZED_PROPERTIES + BOOST)
    participant Merge as Merge & Deduplicate

    Client->>SearchFn: search(term, { limit?, offset?, language? })
    alt term is empty
        SearchFn->>FuzzyPass: single search (normalized)
        FuzzyPass-->>SearchFn: fuzzy results
    else
        SearchFn->>ExactPass: first pass (exact fields, low tolerance)
        ExactPass-->>SearchFn: exact results
        SearchFn->>FuzzyPass: second pass (normalized fields, higher tolerance + boost)
        FuzzyPass-->>SearchFn: fuzzy results
    end
    SearchFn->>Merge: combine exact + fuzzy
    Merge->>Merge: deduplicate by id, apply offset/limit
    Merge-->>SearchFn: paginated results
    SearchFn-->>Client: Results<DictionaryDocument> (includes word_exact & morphology)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hopped through words both soft and exact,
I nudged each form so searches act —
Singular, plural, past and now,
Tags remembered—take a bow! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Orama search relevance improvements' directly captures the main objective of the PR, which implements a two-pass search strategy with enhanced relevance features across multiple search-related files.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4931f77 and b95a350.

📒 Files selected for processing (2)
  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use TypeScript with strict typing across entire codebase
Do not use any type unless absolutely necessary
Error handling with try/catch blocks and structured error types
Use Drizzle ORM for database operations
Use DisplayError class for user-friendly error messages and Result<T, E> type for explicit error handling (Rust-like pattern)

Files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Write self-documenting code and avoid overuse of comments
Component naming: PascalCase for components, camelCase for functions/variables

Files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/**/*.{ts,tsx}: Web app uses Tanstack Router for client-side routing
Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Files:

  • apps/web/src/hooks/useSearch.ts
apps/{web,mobile}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Lingui (v5) for internationalization in both web and mobile applications

Files:

  • apps/web/src/hooks/useSearch.ts
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Applied to files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:48:31.996Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:59-60
Timestamp: 2025-11-27T06:48:31.996Z
Learning: In the Bahar project, Orama methods (insert, update, remove, search) are synchronous by default since no async plugins are being used. Orama v3.0.0+ core methods are sync; they only become async when specific plugins like plugin-embeddings are added. Do not suggest awaiting Orama operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:48:57.365Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:84-85
Timestamp: 2025-11-27T06:48:57.365Z
Learning: In the Orama search library, core operations like insert(), remove(), and update() are synchronous by default. They only become asynchronous when specific plugins (e.g., embedding generators) or Orama Cloud features are used. Don't suggest awaiting these operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:49:36.986Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:114-115
Timestamp: 2025-11-27T06:49:36.986Z
Learning: In the bahar project, Orama database operations (insert, update, remove) are synchronous because only the plugin-qps plugin is used, which is a local scoring plugin. These operations should not be awaited.

Applied to files:

  • packages/search/src/database.ts
🧬 Code graph analysis (1)
packages/search/src/database.ts (3)
packages/search/src/schema.ts (1)
  • DictionaryDocument (66-82)
apps/web/src/lib/utils.ts (1)
  • stripArabicDiacritics (118-121)
packages/search/src/arabic.ts (1)
  • stripArabicDiacritics (9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (8)
packages/search/src/database.ts (5)

5-24: LGTM - Clean imports and type alias.

The imports are well-organized, and the SearchResults type alias improves readability by avoiding repetition of the verbose generic type throughout the file.


99-137: Well-structured search configuration constants.

The separation of exact and normalized properties with corresponding boost configuration is a clean approach for the two-pass search strategy. The boost values (10) consistently prioritize key fields.


161-171: Early return for empty term handles edge case correctly.

When no search term is provided, bypassing the two-pass logic and performing a simple search with pagination parameters is appropriate.


173-190: Tolerance scaling based on term length is well-designed.

The adaptive tolerance strategy (0 for short terms ≤4 chars, 1 for longer) for exact matching provides a good balance between precision and recall for Arabic text search.
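
As a rough illustration of the length-based rule described in this comment, the sketch below picks the tolerance for the exact pass. The threshold (4 characters) and the two tolerance values (0 and 1) come from the comment itself; the helper name `exactTolerance`, the constant name, and the choice to count Unicode code points are assumptions, not taken from database.ts.

```typescript
// Hypothetical helper (not from database.ts): choose the typo tolerance
// for the exact pass from the term length.
const SHORT_TERM_MAX_LENGTH = 4;

function exactTolerance(term: string): number {
  // Array.from iterates by Unicode code point rather than UTF-16 code unit.
  const length = Array.from(term.trim()).length;
  // Short terms get no tolerance; longer terms allow one edit.
  return length <= SHORT_TERM_MAX_LENGTH ? 0 : 1;
}
```

With this rule a three-letter root is matched strictly, while a five-letter word tolerates a single edit.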


207-220: Offset handling and count estimation have been improved.

The previous review comments have been addressed:

  • Offset is now correctly applied via fetchLimit = offset + limit and .slice(offset, offset + limit) for proper pagination
  • Count calculation now uses Math.max(exactResults.count, fuzzyResults.count) which provides a reasonable estimate

However, note that estimatedCount is still only an estimate: the deduplicated total lies somewhere between Math.max(exactResults.count, fuzzyResults.count) and the sum of the two counts, so it can undercount when the exact and fuzzy result sets only partially overlap. This is acceptable for UI purposes and avoids the complexity of computing exact deduplicated counts across all results.
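
The merge step discussed here can be sketched roughly as follows. This is a simplified stand-in, not the code in database.ts: the `Hit` and `PassResult` types and the `mergePasses` name are hypothetical, exact hits are assumed to be placed ahead of fuzzy hits, and each pass is assumed to have been fetched with a limit of `offset + limit` as the comment describes.

```typescript
// Hypothetical, simplified result shapes (the real ones come from Orama).
interface Hit {
  id: string;
  score: number;
}

interface PassResult {
  count: number;
  hits: Hit[];
}

// Put exact hits first, drop duplicate ids, then apply offset/limit
// to the combined list.
function mergePasses(
  exact: PassResult,
  fuzzy: PassResult,
  offset: number,
  limit: number
): PassResult {
  const seen = new Set<string>();
  const merged: Hit[] = [];

  for (const hit of [...exact.hits, ...fuzzy.hits]) {
    if (seen.has(hit.id)) continue; // already contributed by an earlier hit
    seen.add(hit.id);
    merged.push(hit);
  }

  return {
    // Estimate only: the true deduplicated total is at least this value.
    count: Math.max(exact.count, fuzzy.count),
    hits: merged.slice(offset, offset + limit),
  };
}
```

Slicing after deduplication is what keeps page boundaries stable: a document that appears in both passes occupies only one slot in the merged list.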

apps/web/src/hooks/useSearch.ts (3)

1-7: Good refactor - Uses shared search module.

The previous review comments about duplicated two-pass logic have been addressed. The hook now correctly imports and delegates to searchDictionary from the shared package, maintaining a single source of truth.


34-46: Clean delegation to searchDictionary.

The simplified search callback properly delegates to the shared searchDictionary function with appropriate parameters. This eliminates the previously duplicated two-pass search logic.


172-178: LGTM - Proper hasMore calculation.

The condition hits.length >= searchResultsMetadata.count correctly determines when all results have been loaded.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (5)
packages/search/src/arabic.ts (1)

23-31: Good deprecation documentation, but consider removing the export.

The deprecation warning is clear and well-documented. However, since this function is marked as deprecated and "not used in search normalization," you might consider:

  1. Removing it from exports to prevent accidental usage
  2. Or keeping it unexported for potential internal use

If external consumers might still need it, the current approach is acceptable.

apps/web/src/lib/search/index.ts (2)

139-164: Consider extracting the morphology mapping to a reusable helper.

This morphology transformation logic is duplicated in rehydrateOramaDb (lines 253-278). Extracting it to a helper function would improve maintainability.

🔎 Proposed helper extraction
// Add at file level
const mapMorphologyForIndex = (
  morphology: z.infer<typeof MorphologySchema> | null | undefined
): DictionaryDocument["morphology"] => {
  if (!morphology) return undefined;
  
  return {
    ism: morphology.ism
      ? {
          singular: morphology.ism.singular,
          plurals: morphology.ism.plurals?.map((p) => p.word),
          singular_exact: morphology.ism.singular,
          plurals_exact: morphology.ism.plurals?.map((p) => p.word),
        }
      : undefined,
    verb: morphology.verb
      ? {
          past_tense: morphology.verb.past_tense,
          present_tense: morphology.verb.present_tense,
          masadir: morphology.verb.masadir?.map((m) => m.word),
          past_tense_exact: morphology.verb.past_tense,
          present_tense_exact: morphology.verb.present_tense,
          masadir_exact: morphology.verb.masadir?.map((m) => m.word),
        }
      : undefined,
  };
};

Then use: morphology: mapMorphologyForIndex(morphology),


229-240: Rehydration silently skips morphology parsing errors unlike hydration.

In hydrateOramaDb, morphology validation failures are logged to Sentry (lines 108-117), but rehydrateOramaDb doesn't log these errors. Consider adding similar logging for consistency.

apps/mobile/src/lib/search/index.ts (1)

85-141: Morphology mapping duplicated across mobile and web apps.

The morphology transformation logic (lines 118-141 and 282-305) is identical to the web app implementation. Consider extracting this to a shared utility in @bahar/search or @bahar/db-operations to avoid drift between platforms.

packages/search/src/tokenizer.ts (1)

48-58: EXACT_PROPS duplicated across multiple files.

This list is also defined in:

  • packages/search/src/database.ts (EXACT_PROPERTIES)
  • apps/web/src/hooks/useSearch.ts (EXACT_PROPERTIES)

Consider exporting this from a single location to ensure consistency.

🔎 Proposed consolidation

Export from packages/search/src/schema.ts:

export const EXACT_SEARCH_PROPERTIES = [
  "word_exact",
  "morphology.ism.singular_exact",
  "morphology.ism.plurals_exact",
  "morphology.verb.past_tense_exact",
  "morphology.verb.present_tense_exact",
  "morphology.verb.masadir_exact",
] as const;

export const NORMALIZED_SEARCH_PROPERTIES = [
  "word",
  "translation",
  "definition",
  "tags",
  "morphology.ism.plurals",
  "morphology.ism.singular",
  "morphology.verb.masadir",
  "morphology.verb.past_tense",
  "morphology.verb.present_tense",
] as const;

Then import where needed.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between d528e9c and 07f6e4c.

📒 Files selected for processing (7)
  • apps/mobile/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/lib/search/index.ts
  • packages/search/src/arabic.ts
  • packages/search/src/database.ts
  • packages/search/src/schema.ts
  • packages/search/src/tokenizer.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use TypeScript with strict typing across entire codebase
Do not use any type unless absolutely necessary
Error handling with try/catch blocks and structured error types
Use Drizzle ORM for database operations
Use DisplayError class for user-friendly error messages and Result<T, E> type for explicit error handling (Rust-like pattern)

Files:

  • packages/search/src/tokenizer.ts
  • apps/mobile/src/lib/search/index.ts
  • packages/search/src/database.ts
  • packages/search/src/arabic.ts
  • apps/web/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/schema.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Write self-documenting code and avoid overuse of comments
Component naming: PascalCase for components, camelCase for functions/variables

Files:

  • packages/search/src/tokenizer.ts
  • apps/mobile/src/lib/search/index.ts
  • packages/search/src/database.ts
  • packages/search/src/arabic.ts
  • apps/web/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/schema.ts
apps/mobile/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/mobile/**/*.{ts,tsx}: Mobile app uses UniWind (Tailwind for React Native) with Tailwind CSS v4
Mobile app uses Expo with file-based routing (Expo Router)

Files:

  • apps/mobile/src/lib/search/index.ts
apps/{web,mobile}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Lingui (v5) for internationalization in both web and mobile applications

Files:

  • apps/mobile/src/lib/search/index.ts
  • apps/web/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/**/*.{ts,tsx}: Web app uses Tanstack Router for client-side routing
Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Files:

  • apps/web/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization
📚 Learning: 2025-11-27T06:51:53.688Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/lib/search/orama-tokenizer.ts:31-31
Timestamp: 2025-11-27T06:51:53.688Z
Learning: In Orama (search library), custom tokenizers must include a `normalizationCache: Map<string, string>` property as part of the required tokenizer interface, even if not explicitly used in the tokenize method implementation. This is part of Orama's documented API contract for custom tokenizers.

Applied to files:

  • packages/search/src/tokenizer.ts
  • packages/search/src/schema.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Applied to files:

  • packages/search/src/tokenizer.ts
  • apps/mobile/src/lib/search/index.ts
  • packages/search/src/database.ts
  • apps/web/src/lib/search/index.ts
  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/schema.ts
📚 Learning: 2025-11-27T06:48:31.996Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:59-60
Timestamp: 2025-11-27T06:48:31.996Z
Learning: In the Bahar project, Orama methods (insert, update, remove, search) are synchronous by default since no async plugins are being used. Orama v3.0.0+ core methods are sync; they only become async when specific plugins like plugin-embeddings are added. Do not suggest awaiting Orama operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:48:57.365Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:84-85
Timestamp: 2025-11-27T06:48:57.365Z
Learning: In the Orama search library, core operations like insert(), remove(), and update() are synchronous by default. They only become asynchronous when specific plugins (e.g., embedding generators) or Orama Cloud features are used. Don't suggest awaiting these operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:49:36.986Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:114-115
Timestamp: 2025-11-27T06:49:36.986Z
Learning: In the bahar project, Orama database operations (insert, update, remove) are synchronous because only the plugin-qps plugin is used, which is a local scoring plugin. These operations should not be awaited.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-12-25T22:07:11.870Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 27
File: apps/api/src/db/schema/auth.ts:34-36
Timestamp: 2025-12-25T22:07:11.870Z
Learning: Files in apps/api/src/db/schema/auth.ts are auto-generated by better-auth and should not be manually modified as changes will be overwritten. Any schema issues should be tested and reported upstream to better-auth.

Applied to files:

  • packages/search/src/schema.ts
🧬 Code graph analysis (4)
packages/search/src/tokenizer.ts (1)
packages/search/src/arabic.ts (2)
  • stripArabicDiacritics (9-11)
  • normalizeArabicForSearch (40-42)
apps/mobile/src/lib/search/index.ts (2)
packages/db-operations/src/converters.ts (1)
  • safeJsonParse (31-52)
packages/drizzle-user-db-schemas/src/types.ts (1)
  • MorphologySchema (31-77)
apps/web/src/lib/search/index.ts (2)
packages/db-operations/src/converters.ts (1)
  • safeJsonParse (31-52)
packages/drizzle-user-db-schemas/src/types.ts (1)
  • MorphologySchema (31-77)
apps/web/src/hooks/useSearch.ts (3)
packages/search/src/schema.ts (1)
  • DictionaryDocument (66-82)
apps/mobile/src/lib/search/index.ts (1)
  • getOramaDb (37-42)
apps/web/src/lib/utils.ts (1)
  • stripArabicDiacritics (120-123)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (9)
packages/search/src/arabic.ts (1)

33-42: Normalization simplification looks correct.

The removal of weak-letter normalization aligns with the two-pass search strategy. Preserving hamza variants and weak letters in exact fields enables better ranking of precise matches.
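
To make the distinction concrete, here is a minimal sketch of diacritic-only stripping, which removes vowel marks but leaves hamza variants and weak letters untouched. It is an assumption about what such a helper might look like: `stripDiacriticsSketch` is a hypothetical stand-in, and the actual `stripArabicDiacritics` in arabic.ts may cover a different set of code points.

```typescript
// Hypothetical stand-in for stripArabicDiacritics (the real helper may differ).
// Covers fathatan..sukun (U+064B..U+0652) and the superscript alef (U+0670).
const ARABIC_DIACRITICS = /[\u064B-\u0652\u0670]/g;

function stripDiacriticsSketch(text: string): string {
  // Only combining marks are removed; base letters, including hamza
  // carriers such as U+0623, are left exactly as written.
  return text.replace(ARABIC_DIACRITICS, "");
}
```

Under this sketch, a fully vowelled form and its bare form index to the same exact token, while words that differ only in hamza placement remain distinct.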

apps/web/src/lib/search/index.ts (1)

104-117: LGTM - Good conditional logging for morphology validation.

The check !morphologyResult.ok && entry.morphology correctly avoids logging when morphology is simply null/undefined (expected case).

packages/search/src/database.ts (2)

99-137: Well-structured search configuration constants.

The separation of exact and normalized properties with associated boost values is clean and maintainable.


154-162: Empty term search bypasses offset parameter.

When term is empty, the search correctly falls back to a simple search, but options?.properties is also ignored. This may be intentional for listing all entries, but worth noting for API consistency.

apps/mobile/src/lib/search/index.ts (1)

90-99: Skip logic is consistent between rehydrate and hydrate.

In rehydrateOramaDb, entries are skipped if any of root, tags, antonyms, or examples fail parsing (lines 90-99). In hydrateOramaDb, the same validation includes morphologyResult in the error logging but not in the skip condition (lines 216-222). The skip logic is consistent, which is good.

packages/search/src/tokenizer.ts (1)

23-31: Good separation of exact tokenizer.

The exact tokenizer correctly disables stemming while still tokenizing Arabic text, enabling precise matching on *_exact fields.
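
For readers unfamiliar with the pattern, a tokenizer that skips stemming can be sketched as below. This is not the `arabicExactTokenizer` from tokenizer.ts: `TokenizerLike` is a local stand-in modeled on the custom-tokenizer shape mentioned in the learnings in this thread (including `normalizationCache`), `exactTokenizerSketch` is a hypothetical name, and the input is assumed to be diacritic-free already.

```typescript
// Local stand-in for the tokenizer shape; deliberately not imported from Orama.
interface TokenizerLike {
  language: string;
  normalizationCache: Map<string, string>;
  tokenize(raw: string, language?: string, prop?: string): string[];
}

// Hypothetical exact tokenizer: splits on whitespace and returns each
// token as typed, with no stemming and no letter normalization.
const exactTokenizerSketch: TokenizerLike = {
  language: "arabic",
  normalizationCache: new Map<string, string>(),
  tokenize(raw: string): string[] {
    return raw.split(/\s+/).filter((token) => token.length > 0);
  },
};
```

Because tokens are kept verbatim, a query only matches an `*_exact` field when the letters line up, which is what lets the exact pass rank precise matches first.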

packages/search/src/schema.ts (3)

11-39: Well-documented schema extension for exact matching.

The JSDoc clearly explains the purpose of _exact fields, and the schema structure logically separates normalized and exact variants.


41-60: IndexedMorphology interface is well-structured.

The flattened structure correctly mirrors the schema fields and provides proper typing for both normalized and exact variants.


66-82: DictionaryDocument correctly updated with new fields.

The addition of word_exact and updated morphology type aligns with the schema changes.

Comment thread apps/web/src/hooks/useSearch.ts Outdated
Comment thread apps/web/src/hooks/useSearch.ts Outdated
Comment thread apps/web/src/hooks/useSearch.ts Outdated
Comment thread packages/search/src/database.ts
Comment thread packages/search/src/tokenizer.ts
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Dec 30, 2025

Deploying bahar with Cloudflare Pages

Latest commit: b95a350
Status: ✅  Deploy successful!
Preview URL: https://71ccc1d1.bahar-5xu.pages.dev
Branch Preview URL: https://refactor-two-pass-orama-sear.bahar-5xu.pages.dev


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/web/src/hooks/useSearch.ts (2)

137-153: Missing searchQueryLanguage in dependency array.

The useEffect at line 137 uses searchQueryLanguage but it's not in the dependency array. This could cause stale language values when the search term's language changes:

🔎 Proposed fix
   useEffect(() => {
     // Don't search when offset is 0 since it
     // was already handled in the other useEffect
     if (offset === 0) return;

     const { hits } = search(
       {
         ...params,
         offset,
       },
       searchQueryLanguage
     );

     setHits((previousHits) =>
       previousHits ? [...previousHits, ...hits] : hits
     );
-  }, [offset, setHits, search]);
+  }, [offset, setHits, search, searchQueryLanguage]);

155-170: Missing searchQueryLanguage in dependency array.

Similar to the previous effect, this useEffect uses searchQueryLanguage but omits it from dependencies:

🔎 Proposed fix
   useEffect(() => {
     const { hits, ...metadata } = search(
       {
         ...params,
         offset: 0,
       },
       searchQueryLanguage
     );

     setOffset(0);
     setHits(hits);
     setSearchResultsMetadata(metadata);
     setHasMore(hits.length < metadata.count);
-  }, [paramsKey, setOffset, setHits, setSearchResultsMetadata, search]);
+  }, [paramsKey, setOffset, setHits, setSearchResultsMetadata, search, searchQueryLanguage]);
🧹 Nitpick comments (5)
apps/web/src/hooks/db/index.ts (1)

40-42: Consider updating suggested tags on edit as well.

The useEditDictionaryEntry hook doesn't update suggestedTagsAtom when editing a word with tags, while useAddDictionaryEntry does. If this inconsistency is intentional (only track tags from new entries), consider adding a brief comment. Otherwise, you may want to apply the same logic in the edit flow.

apps/web/src/lib/search/index.ts (1)

174-214: Consider extracting shared morphology transformation logic.

The morphology transformation logic (lines 189-214) is duplicated in toOramaDocument, hydrateOramaDb, and rehydrateOramaDb. While the input types differ (SelectDictionaryEntry vs parsed MorphologySchema), the transformation logic is identical once morphology is parsed.

Consider extracting a helper function that transforms the parsed morphology object:

🔎 Proposed helper
const transformMorphology = (morphology: z.infer<typeof MorphologySchema> | undefined) => {
  if (!morphology) return undefined;
  return {
    ism: morphology.ism
      ? {
          singular: morphology.ism.singular,
          plurals: morphology.ism.plurals?.map((p) => p.word),
          singular_exact: morphology.ism.singular,
          plurals_exact: morphology.ism.plurals?.map((p) => p.word),
        }
      : undefined,
    verb: morphology.verb
      ? {
          past_tense: morphology.verb.past_tense,
          present_tense: morphology.verb.present_tense,
          masadir: morphology.verb.masadir?.map((m) => m.word),
          past_tense_exact: morphology.verb.past_tense,
          present_tense_exact: morphology.verb.present_tense,
          masadir_exact: morphology.verb.masadir?.map((m) => m.word),
        }
      : undefined,
  };
};
apps/web/src/components/TagsCombobox.tsx (3)

268-277: Fragile timeout pattern in onBlur.

The 150ms setTimeout delay to allow dropdown clicks is a common workaround but can be unreliable on slower devices or under load. Consider using onMouseDown with preventDefault() on dropdown options instead, which prevents the blur event entirely.

🔎 Alternative approach

On the dropdown option elements, use onMouseDown instead of onClick:

onMouseDown={(e) => {
  e.preventDefault(); // Prevents input blur
  selectOption(optionValue);
}}

This eliminates the need for the setTimeout workaround.


309-312: Hardcoded "Searching..." string should be internationalized.

Per coding guidelines, Lingui is used for i18n. This hardcoded string should use <Trans> or t:

🔎 Proposed fix
+import { Trans } from "@lingui/react/macro";
 // ...
           {isSearching ? (
             <div className="py-6 text-center text-muted-foreground text-sm">
-              Searching...
+              <Trans>Searching...</Trans>
             </div>

370-372: Default "Add" text should also be internationalized.

The fallback Add "{trimmedInput}" text should use Lingui:

🔎 Proposed fix
                  {renderCreateOption ? (
                    renderCreateOption(trimmedInput)
                  ) : (
-                    <>Add "{trimmedInput}"</>
+                    <Trans>Add "{trimmedInput}"</Trans>
                  )}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 07f6e4c and 0aab05f.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (10)
  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/components/TagsCombobox.tsx
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/lib/search/index.ts
  • apps/web/src/lib/utils.ts
  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • biome.jsonc
  • packages/search/src/database.ts
💤 Files with no reviewable changes (1)
  • apps/web/src/lib/utils.ts
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use TypeScript with strict typing across entire codebase
Do not use any type unless absolutely necessary
Error handling with try/catch blocks and structured error types
Use Drizzle ORM for database operations
Use DisplayError class for user-friendly error messages and Result<T, E> type for explicit error handling (Rust-like pattern)

Files:

  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/components/TagsCombobox.tsx
  • apps/web/src/lib/search/index.ts
  • packages/search/src/database.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Write self-documenting code and avoid overuse of comments
Component naming: PascalCase for components, camelCase for functions/variables

Files:

  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/components/TagsCombobox.tsx
  • apps/web/src/lib/search/index.ts
  • packages/search/src/database.ts
apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/**/*.{ts,tsx}: Web app uses Tanstack Router for client-side routing
Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Files:

  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/components/TagsCombobox.tsx
  • apps/web/src/lib/search/index.ts
apps/{web,mobile}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Lingui (v5) for internationalization in both web and mobile applications

Files:

  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/components/TagsCombobox.tsx
  • apps/web/src/lib/search/index.ts
**/*.{tsx,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{tsx,jsx}: React components use functional style with hooks
Prefer using jotai atoms over React Context
State management: use Tanstack Query for async state and Jotai for atomic state
Use Tanstack Query for server state management

Files:

  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/components/TagsCombobox.tsx
apps/web/**/*.{tsx,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Web app uses Shadcn/UI components and Tailwind CSS v4 for styling, use the cn() utility function for combining and conditionally applying Tailwind classes

Files:

  • apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx
  • apps/web/src/components/features/dictionary/add/TagsFormSection.tsx
  • apps/web/src/components/TagsCombobox.tsx
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{tsx,jsx} : Prefer using jotai atoms over React Context

Applied to files:

  • apps/web/src/atoms/suggested-tags.ts
  • apps/web/src/hooks/db/index.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Applied to files:

  • apps/web/src/hooks/useSearch.ts
  • apps/web/src/hooks/db/index.ts
  • apps/web/src/lib/search/index.ts
  • packages/search/src/database.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{tsx,jsx} : State management: use Tanstack Query for async state and Jotai for atomic state

Applied to files:

  • apps/web/src/hooks/db/index.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{tsx,jsx} : Use Tanstack Query for server state management

Applied to files:

  • apps/web/src/hooks/db/index.ts
📚 Learning: 2025-11-27T06:48:57.365Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:84-85
Timestamp: 2025-11-27T06:48:57.365Z
Learning: In the Orama search library, core operations like insert(), remove(), and update() are synchronous by default. They only become asynchronous when specific plugins (e.g., embedding generators) or Orama Cloud features are used. Don't suggest awaiting these operations unless async plugins are confirmed to be in use.

Applied to files:

  • apps/web/src/hooks/db/index.ts
  • packages/search/src/database.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{tsx,jsx} : React components use functional style with hooks

Applied to files:

  • apps/web/src/components/TagsCombobox.tsx
📚 Learning: 2025-11-27T06:48:31.996Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:59-60
Timestamp: 2025-11-27T06:48:31.996Z
Learning: In the Bahar project, Orama methods (insert, update, remove, search) are synchronous by default since no async plugins are being used. Orama v3.0.0+ core methods are sync; they only become async when specific plugins like plugin-embeddings are added. Do not suggest awaiting Orama operations unless async plugins are confirmed to be in use.

Applied to files:

  • apps/web/src/lib/search/index.ts
  • packages/search/src/database.ts
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{ts,tsx} : Use TypeScript with strict typing across entire codebase

Applied to files:

  • biome.jsonc
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to **/*.{ts,tsx} : Do not use `any` type unless absolutely necessary

Applied to files:

  • biome.jsonc
📚 Learning: 2025-11-27T06:49:36.986Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:114-115
Timestamp: 2025-11-27T06:49:36.986Z
Learning: In the bahar project, Orama database operations (insert, update, remove) are synchronous because only the plugin-qps plugin is used, which is a local scoring plugin. These operations should not be awaited.

Applied to files:

  • packages/search/src/database.ts
🧬 Code graph analysis (3)
apps/web/src/hooks/useSearch.ts (4)
packages/search/src/schema.ts (1)
  • DictionaryDocument (66-82)
packages/search/src/database.ts (2)
  • SearchLanguage (139-139)
  • searchDictionary (147-215)
apps/web/src/lib/search/index.ts (1)
  • getOramaDb (23-23)
apps/mobile/src/lib/search/index.ts (1)
  • getOramaDb (37-42)
apps/web/src/hooks/db/index.ts (3)
apps/web/src/hooks/useSearch.ts (1)
  • useSearch (26-101)
apps/web/src/atoms/suggested-tags.ts (1)
  • suggestedTagsAtom (3-16)
apps/web/src/lib/search/index.ts (2)
  • getOramaDb (23-23)
  • toOramaDocument (29-72)
apps/web/src/lib/search/index.ts (4)
packages/drizzle-user-db-schemas/src/index.ts (1)
  • SelectDictionaryEntry (12-12)
packages/search/src/schema.ts (1)
  • DictionaryDocument (66-82)
packages/db-operations/src/converters.ts (1)
  • safeJsonParse (31-52)
packages/drizzle-user-db-schemas/src/types.ts (1)
  • MorphologySchema (31-77)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (8)
apps/web/src/hooks/db/index.ts (1)

66-67: LGTM!

The transformation to toOramaDocument ensures consistency with hydration flows, and the synchronous Orama insert call is correct per the project's configuration (no async plugins).

apps/web/src/components/features/dictionary/add/TagsFormSection.tsx (2)

19-38: LGTM!

Clean implementation using Jotai for state management (per coding guidelines), proper filtering logic for available suggestions, and defensive duplicate checking in addTag. The useFieldArray with replace is appropriate for this use case.
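
As a rough illustration of the filtering and duplicate check praised above, here is a minimal sketch written as plain functions. The helper names, the trimming step, and the case-sensitive comparison are assumptions for illustration; the real component operates on useFieldArray state and may differ.

```typescript
// Hypothetical helpers, not the code in TagsFormSection.tsx.

/** Suggestions that are not already selected, in their original order. */
function availableSuggestions(
  suggested: readonly string[],
  selected: readonly string[],
): string[] {
  const taken = new Set(selected);
  return suggested.filter((tag) => !taken.has(tag));
}

/** Returns a new list with `tag` appended, or the same list if it is blank or already present. */
function addTag(selected: readonly string[], tag: string): readonly string[] {
  const trimmed = tag.trim();
  if (trimmed === "" || selected.includes(trimmed)) return selected;
  return [...selected, trimmed];
}
```

Returning the original array on a duplicate keeps the check cheap for callers, who can compare references to see whether anything changed.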


53-62: Good integration with TagsCombobox.

The TagsCombobox integration is well-structured with proper value extraction, i18n support via Lingui, and clean data transformation in onValueChange.

apps/web/src/lib/search/index.ts (1)

25-72: LGTM! Well-documented transformation function.

The toOramaDocument function correctly handles the morphology transformation (DB {word: string}[] → Orama string[]) and populates _exact variants for the two-pass search strategy.
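
The shape change described above can be sketched roughly as follows. The field names `plurals` and `plurals_exact` and the `flattenForms` helper are hypothetical; only the `{word: string}[]` to `string[]` flattening and the `_exact` mirroring come from the review comment.

```typescript
// Hypothetical shapes for illustration; not the project's actual schema.
interface DbWordForm {
  word: string;
}

interface FlattenedForms {
  plurals: string[]; // assumed to be indexed with normalization
  plurals_exact: string[]; // same values, assumed to be indexed for exact matching
}

/** Flattens DB-style `{ word }[]` into plain strings and mirrors them into an `_exact` field. */
function flattenForms(
  forms: readonly DbWordForm[] | null | undefined,
): FlattenedForms {
  const words = (forms ?? [])
    .map((form) => form.word.trim())
    .filter((word) => word.length > 0);
  return { plurals: words, plurals_exact: [...words] };
}
```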

packages/search/src/database.ts (2)

157-170: LGTM on empty term handling.

The early return for empty terms correctly bypasses the two-pass logic and applies the offset, which is appropriate for browsing/listing scenarios.


210-214: Good fix on count calculation.

Using mergedHits.length for count addresses the previous review comment about potential inaccuracy. This now correctly reflects the actual number of results returned.

apps/web/src/components/TagsCombobox.tsx (1)

77-105: LGTM on async data fetching.

Good use of cancellation flag to prevent state updates on unmounted components, and startTransition for non-urgent UI updates.

apps/web/src/hooks/useSearch.ts (1)

34-46: LGTM! Clean delegation to searchDictionary.

The refactor correctly delegates to searchDictionary from the shared package, eliminating the previously duplicated two-pass search logic. This addresses the past review comments about code duplication.

Comment thread apps/web/src/atoms/suggested-tags.ts
Comment thread apps/web/src/routes/_authorized-layout/_app-layout/dictionary/edit/$wordId.tsx Outdated
Comment thread biome.jsonc
Comment thread packages/search/src/database.ts Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/web/src/hooks/useSearch.ts (2)

137-153: Missing dependencies in useEffect.

The useEffect at lines 137-153 references params and searchQueryLanguage but only includes [offset, setHits, search] in its dependency array. This could cause stale closures if params or searchQueryLanguage change while offset remains non-zero.

🔎 Suggested fix
   useEffect(() => {
     // Don't search when offset is 0 since it
     // was already handled in the other useEffect
     if (offset === 0) return;

     const { hits } = search(
       {
         ...params,
         offset,
       },
       searchQueryLanguage
     );

     setHits((previousHits) =>
       previousHits ? [...previousHits, ...hits] : hits
     );
-  }, [offset, setHits, search]);
+  }, [offset, setHits, search, paramsKey, searchQueryLanguage]);

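Note: Using paramsKey instead of params avoids unnecessary effect re-runs. If params is a new object reference on each render, listing it as a dependency would re-run the effect every time, whereas paramsKey is a stable string representation that only changes when the contents change.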
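
To see why a string key behaves differently from an object in a dependency array, here is a small sketch. React compares each dependency with Object.is; the `depsChanged` helper, the params shape, and the `JSON.stringify` derivation of the key are assumptions for illustration, since the hook's actual paramsKey computation is not shown here.

```typescript
/** Mirrors React's dependency comparison: same length and Object.is on each pair. */
function depsChanged(
  prev: readonly unknown[],
  next: readonly unknown[],
): boolean {
  return (
    prev.length !== next.length ||
    prev.some((dep, i) => !Object.is(dep, next[i]))
  );
}

// Two renders that build equal-but-distinct params objects (hypothetical shape).
const renderA = { term: "kitab", limit: 20 };
const renderB = { term: "kitab", limit: 20 };

// The object dependency looks "changed" even though nothing meaningful did.
const objectDepChanged = depsChanged([renderA], [renderB]);

// A serialized key compares by value, so it only changes when the contents do.
const keyDepChanged = depsChanged(
  [JSON.stringify(renderA)],
  [JSON.stringify(renderB)],
);
```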


157-170: Missing searchQueryLanguage in dependency array.

The effect uses searchQueryLanguage but it's not listed in dependencies. If the language changes (e.g., user switches from Arabic to English input), the search won't re-execute until another dependency changes.

🔎 Suggested fix
-  }, [paramsKey, setOffset, setHits, setSearchResultsMetadata, search]);
+  }, [paramsKey, searchQueryLanguage, setOffset, setHits, setSearchResultsMetadata, search]);
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 4931f77 and b95a350.

📒 Files selected for processing (2)
  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use TypeScript with strict typing across entire codebase
Do not use any type unless absolutely necessary
Error handling with try/catch blocks and structured error types
Use Drizzle ORM for database operations
Use DisplayError class for user-friendly error messages and Result<T, E> type for explicit error handling (Rust-like pattern)

Files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx,js,jsx}: Write self-documenting code and avoid overuse of comments
Component naming: PascalCase for components, camelCase for functions/variables

Files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
apps/web/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

apps/web/**/*.{ts,tsx}: Web app uses Tanstack Router for client-side routing
Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Files:

  • apps/web/src/hooks/useSearch.ts
apps/{web,mobile}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use Lingui (v5) for internationalization in both web and mobile applications

Files:

  • apps/web/src/hooks/useSearch.ts
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization
📚 Learning: 2025-12-28T19:33:44.247Z
Learnt from: CR
Repo: Shunseii/bahar PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:33:44.247Z
Learning: Applies to apps/web/**/*.{ts,tsx} : Orama is used for client-side WASM search engine with multi-language support (Arabic + English), indexed from local database on app initialization

Applied to files:

  • apps/web/src/hooks/useSearch.ts
  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:48:31.996Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:59-60
Timestamp: 2025-11-27T06:48:31.996Z
Learning: In the Bahar project, Orama methods (insert, update, remove, search) are synchronous by default since no async plugins are being used. Orama v3.0.0+ core methods are sync; they only become async when specific plugins like plugin-embeddings are added. Do not suggest awaiting Orama operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:48:57.365Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:84-85
Timestamp: 2025-11-27T06:48:57.365Z
Learning: In the Orama search library, core operations like insert(), remove(), and update() are synchronous by default. They only become asynchronous when specific plugins (e.g., embedding generators) or Orama Cloud features are used. Don't suggest awaiting these operations unless async plugins are confirmed to be in use.

Applied to files:

  • packages/search/src/database.ts
📚 Learning: 2025-11-27T06:49:36.986Z
Learnt from: Shunseii
Repo: Shunseii/bahar PR: 24
File: apps/web/src/hooks/db/index.ts:114-115
Timestamp: 2025-11-27T06:49:36.986Z
Learning: In the bahar project, Orama database operations (insert, update, remove) are synchronous because only the plugin-qps plugin is used, which is a local scoring plugin. These operations should not be awaited.

Applied to files:

  • packages/search/src/database.ts
🧬 Code graph analysis (1)
packages/search/src/database.ts (3)
packages/search/src/schema.ts (1)
  • DictionaryDocument (66-82)
apps/web/src/lib/utils.ts (1)
  • stripArabicDiacritics (118-121)
packages/search/src/arabic.ts (1)
  • stripArabicDiacritics (9-11)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Cloudflare Pages
🔇 Additional comments (8)
packages/search/src/database.ts (5)

5-24: LGTM - Clean imports and type alias.

The imports are well-organized, and the SearchResults type alias improves readability by avoiding repetition of the verbose generic type throughout the file.


99-137: Well-structured search configuration constants.

The separation of exact and normalized properties with corresponding boost configuration is a clean approach for the two-pass search strategy. The boost values (10) consistently prioritize key fields.
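
A configuration along these lines might look like the sketch below. The constant names EXACT_PROPERTIES and NORMALIZED_PROPERTIES and the boost value of 10 come from this PR; every property name other than `word_exact`, the choice of which fields are boosted, and the `passConfig` helper are placeholders, not the contents of database.ts.

```typescript
// Property names other than `word_exact` are placeholders, not the project's schema.
const EXACT_PROPERTIES = ["word_exact", "plurals_exact"] as const;
const NORMALIZED_PROPERTIES = ["word", "plurals", "translation"] as const;

type ExactProperty = (typeof EXACT_PROPERTIES)[number];
type NormalizedProperty = (typeof NORMALIZED_PROPERTIES)[number];

// Only the key fields are boosted in this sketch; which fields count as "key" is assumed.
const EXACT_BOOST: Partial<Record<ExactProperty, number>> = { word_exact: 10 };
const NORMALIZED_BOOST: Partial<Record<NormalizedProperty, number>> = {
  word: 10,
  translation: 10,
};

/** Picks the property list and boost map for one pass of the two-pass search. */
function passConfig(pass: "exact" | "normalized") {
  return pass === "exact"
    ? { properties: [...EXACT_PROPERTIES], boost: EXACT_BOOST }
    : { properties: [...NORMALIZED_PROPERTIES], boost: NORMALIZED_BOOST };
}
```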


161-171: Early return for empty term handles edge case correctly.

When no search term is provided, bypassing the two-pass logic and performing a simple search with pagination parameters is appropriate.


173-190: Tolerance scaling based on term length is well-designed.

The adaptive tolerance strategy (0 for short terms ≤4 chars, 1 for longer) for exact matching provides a good balance between precision and recall for Arabic text search.
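
The rule described above can be written as a one-line helper. This is a sketch: the function name is hypothetical, and measuring length in UTF-16 units after trimming is an assumption, since the comment only says "chars".

```typescript
/** Typo tolerance for the exact pass: none for short terms, one edit for longer ones. */
function exactPassTolerance(term: string): 0 | 1 {
  // Short terms (4 characters or fewer) must match exactly; a single edit
  // on a 3-letter root would match too many unrelated words.
  return term.trim().length <= 4 ? 0 : 1;
}
```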


207-220: Offset handling and count estimation have been improved.

The previous review comments have been addressed:

  • Offset is now correctly applied via fetchLimit = offset + limit and .slice(offset, offset + limit) for proper pagination
  • Count calculation now uses Math.max(exactResults.count, fuzzyResults.count) which provides a reasonable estimate

However, note that estimatedCount may still be imprecise: the maximum of the two counts is a lower bound on the deduplicated total, so it undercounts when the exact and fuzzy passes match largely different documents. This is acceptable for UI purposes and avoids the complexity of computing exact deduplicated counts across all results.
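
The pagination and count logic summarized above can be sketched as follows. The `Hit` and `PassResult` types, the function name, and the choice to rank exact hits ahead of fuzzy ones are assumptions; the `offset + limit` fetch size, the `.slice(offset, offset + limit)` window, and the `Math.max` count are the parts taken from the review.

```typescript
interface Hit {
  id: string;
  score: number;
}

interface PassResult {
  count: number; // total matches reported by the pass
  hits: Hit[]; // up to `offset + limit` hits fetched from the pass
}

/** Merges exact hits ahead of fuzzy hits, drops duplicates by id, then applies the page window. */
function mergeAndPaginate(
  exact: PassResult,
  fuzzy: PassResult,
  offset: number,
  limit: number,
): { hits: Hit[]; count: number } {
  const seen = new Set<string>();
  const merged: Hit[] = [];
  for (const hit of [...exact.hits, ...fuzzy.hits]) {
    if (seen.has(hit.id)) continue;
    seen.add(hit.id);
    merged.push(hit);
  }
  return {
    hits: merged.slice(offset, offset + limit),
    // Lower bound on the deduplicated total; exact when one pass's matches contain the other's.
    count: Math.max(exact.count, fuzzy.count),
  };
}
```

Each pass has to fetch `offset + limit` hits rather than `limit`, because deduplication happens after the merge and the page window is cut from the merged list.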

apps/web/src/hooks/useSearch.ts (3)

1-7: Good refactor - Uses shared search module.

The previous review comments about duplicated two-pass logic have been addressed. The hook now correctly imports and delegates to searchDictionary from the shared package, maintaining a single source of truth.


34-46: Clean delegation to searchDictionary.

The simplified search callback properly delegates to the shared searchDictionary function with appropriate parameters. This eliminates the previously duplicated two-pass search logic.


172-178: LGTM - Proper hasMore calculation.

The condition hits.length >= searchResultsMetadata.count correctly determines when all results have been loaded.

@Shunseii Shunseii merged commit 9fdb4dd into main Dec 30, 2025
2 checks passed
@Shunseii Shunseii deleted the refactor/two-pass-orama-search branch December 30, 2025 02:12
This was referenced Apr 2, 2026