Conversation

@amotl
Member

@amotl amotl commented Oct 21, 2025

@amotl amotl added the "pitch" label (A feature or request about anything, content and layout.) Oct 21, 2025
@coderabbitai

coderabbitai bot commented Oct 21, 2025

Walkthrough

Adds a new "effective-search" FTS guide and updates the FTS index page layout/navigation and cards; also makes small edits to the explain docs (rubric/tag additions and guidance on reporting flaws). All changes are documentation only.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| New FTS guide (docs/feature/search/fts/effective-search.md) | Adds a new documentation guide covering indexing text for effective search and accurate analysis: CrateDB analyzers (default/similar/exact), tokenizers, token/character filters, character folding, lemmatization, spelling-correction filters (Lucene SpellChecker), processing pipeline examples, tokenizer/filter behavior, and high-level explanations. No code changes. |
| FTS index & navigation (docs/feature/search/fts/index.md) | Restructures the FTS index page: renames rubric sections (Guides → Tutorials, Articles → Explanations), replaces grid/info-card entries with card-style links to the new guide, updates toctree entries and tag groupings, and adjusts wording around product and analyzer descriptions. |
| Explain docs tweaks (docs/explain/index.md) | Adds a rubric block labeled 2018, adds a reference tag to effective-fulltext-search, and expands guidance on reporting flaws (instructions referencing the tool flyout and "Suggest improvement"). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Pay attention to cross-links and toctree entries in docs/feature/search/fts/index.md and the new guide to ensure nav consistency.
  • Verify the new guide's terminology and examples align with existing FTS docs.

Possibly related PRs

Suggested labels

guidance

Suggested reviewers

  • surister
  • kneth
  • bmunkholm

Poem

🐰 I hopped through pages, tidy and keen,
I planted a guide where search is seen,
Cards reshuffled, tags all in tune,
A nibble of clarity under the moon. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The pull request title "Search: Indexing Text for Both Effective Search and Accurate Analysis" directly aligns with the main changes in the changeset. The PR adds a new documentation file (effective-search.md) that details indexing text for effective search and accurate analysis using CrateDB, and updates related navigation and index files to incorporate this new content. The title is concise, clear, and fully captures the primary objective without being vague or misleading. |
| Description Check | ✅ Passed | The pull request description is directly related to the changeset and provides meaningful context about the additions. It correctly identifies that the PR adds content referencing the article "Indexing Text for Both Effective Search and Accurate Analysis" by David Norton, includes proper attribution with links to the author's profiles and an archived copy of the article, and provides a preview URL for validation. The description is not vague or off-topic; it clearly communicates the purpose and scope of the changes. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be80745 and 0370884.

📒 Files selected for processing (3)
  • docs/explain/index.md (1 hunks)
  • docs/feature/search/fts/effective-search.md (1 hunks)
  • docs/feature/search/fts/index.md (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/explain/index.md
🧰 Additional context used
🪛 LanguageTool
docs/feature/search/fts/effective-search.md

[grammar] ~83-~83: Ensure spelling is correct
Context: ...Indexing If a client was to search for "wlking to work", they would probably hope to g...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~84-~84: Ensure spelling is correct
Context: ...back like: "I walked to work", "I enjoy walkng to work", and "I walk to work every day...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~88-~88: Often, this adverbial phrase is redundant. Consider using an alternative.
Context: ...ts without other negative consequences. First of all, “walking” is spelled wrong. Second, di...

(FIRST_OF_ALL)


[style] ~159-~159: To elevate your writing, try using an alternative expression here.
Context: ...nd that the actual content of the index does not matter as long as the search results are accur...

(MATTERS_RELEVANT)


[style] ~233-~233: Consider using a different adverb to strengthen your wording.
Context: ...ords (less than 4 characters) which are completely ignored by Lucene. Our spell correctio...

(COMPLETELY_ENTIRELY)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs
🔇 Additional comments (5)
docs/feature/search/fts/effective-search.md (2)

1-14: Article metadata and structure look good.

The article is properly attributed to David Norton from Qualtrics, includes publication date and read time, and presents a well-organized structure with clear sections. The archive link (line 266) ensures long-term accessibility of the original source.


83-91: Intentional misspellings in examples are correctly used.

Lines 83-84 contain "wlking" and "walkng" as examples of misspelled search terms—these are intentional and serve to illustrate the problem the article addresses. No correction needed.

docs/feature/search/fts/index.md (3)

277-277: Rubric naming improvements enhance documentation clarity.

The changes from "Guides" → "Tutorials" (line 277) and "Articles" → "Explanations" (line 301) follow standard documentation taxonomy and make the content organization more explicit and intuitive.

Also applies to: 301-301


341-360: New card section is well-structured and properly integrated.

The card block follows proper RST syntax with metadata, description, footer content (after +++), and tags. The link reference to effective-fulltext-search correctly points to the label defined in effective-search.md (line 1). Tags appropriately categorize the content as Introduction-level, covering Analyzer, Tokenizer, and Plugin topics.

Please verify that the cross-reference label effective-fulltext-search is correctly resolved by running a documentation build or link checker to ensure the hyperlink functions as intended.
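
As a lightweight complement to a full documentation build, the label lookup could be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the docs are MyST Markdown and that target labels are written as `(name)=` on a line of their own. The names `find_labels` and `files_defining` are hypothetical helpers, not part of Sphinx, MyST, or this PR.

```python
import re
from pathlib import Path

# Assumption: MyST-style target labels such as "(effective-fulltext-search)="
# appear alone on a line. Other label syntaxes are not detected by this sketch.
LABEL_RE = re.compile(r"^\(([A-Za-z0-9_.-]+)\)=[ \t]*$", re.MULTILINE)


def find_labels(text: str) -> set[str]:
    """Return all label names defined in one Markdown document."""
    return set(LABEL_RE.findall(text))


def files_defining(label: str, docs_root: str) -> list[Path]:
    """List Markdown files under docs_root that define the given label."""
    return sorted(
        path
        for path in Path(docs_root).rglob("*.md")
        if label in find_labels(path.read_text(encoding="utf-8"))
    )
```

Calling `files_defining("effective-fulltext-search", "docs")` from the repository root would be expected to return exactly one file if the label is defined once; an empty list or several hits would point to a broken or ambiguous cross-reference. A real Sphinx build remains the authoritative check.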


370-370: Navigation entry properly added.

The new effective-search entry in the toctree (line 370) ensures the new page is included in the FTS guide navigation structure and will be discoverable in the table of contents.


@amotl amotl changed the title from "Explain effective search" to "Explain: Indexing Text for Both Effective Search and Accurate Analysis" Oct 21, 2025
@amotl amotl changed the base branch from main to explain October 21, 2025 22:07
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl force-pushed the explain-effective-search branch from cdf7c17 to 4e8a1e1 October 21, 2025 22:11
@amotl amotl changed the title from "Explain: Indexing Text for Both Effective Search and Accurate Analysis" to "Search: Indexing Text for Both Effective Search and Accurate Analysis" Oct 21, 2025
@amotl amotl force-pushed the explain-effective-search branch 2 times, most recently from 204f4fb to 395b467 October 21, 2025 23:02
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl force-pushed the explain-effective-search branch from 395b467 to 7ef86bd October 21, 2025 23:35
@amotl amotl added the "cross linking" (Linking to different locations of the documentation.) and "guidance" (Matters of layout, shape, and structure.) labels Oct 21, 2025
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl added the "reorganize" label (Moving content around, inside and between other systems.) and removed the "pitch", "cross linking", and "guidance" labels Oct 22, 2025
@amotl amotl requested review from matriv and seut October 22, 2025 01:36
@amotl amotl force-pushed the explain-effective-search branch from 7ef86bd to be80745 October 22, 2025 01:39
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/feature/search/fts/effective-search.md (1)

88-88: Optional: Minor style refinements.

Static analysis suggests a few stylistic improvements (lines 88, 159, 233), but these are preferences rather than issues. The current phrasing is natural and idiomatic. If you wish to polish: consider alternatives to "first of all" for variety, and review whether "completely" and "as long as" could be replaced with more concise alternatives. These are entirely optional in a chill review.

Also applies to: 159-159, 233-233

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 7ef86bd and be80745.

🔇 Additional comments (4)
docs/feature/search/fts/effective-search.md (2)

1-14: Excellent article header, metadata, and archival reference.

The article-info frontmatter is properly structured and the archive link to the original Qualtrics engineering blog article is correctly formatted with appropriate versioning.

Also applies to: 262-267


33-113: Strong technical depth and clear pedagogical structure.

The content progresses logically from business rationale (Why CrateDB?) through analyzer fundamentals to implementation techniques (character folding, lemmatization, spelling correction). The lemmatization comparison table and spell correction pseudocode effectively communicate complex concepts with concrete examples (e.g., Unicode apostrophes, German character folding rules, Morphy vs. stemmer accuracy).

Also applies to: 150-248

docs/feature/search/fts/index.md (2)

277-277: Semantic section renaming improves taxonomy consistency.

The updates from "Guides" → "Tutorials" and "Articles" → "Explanations" align with the broader documentation structure (as referenced in the PR context for docs/explain/index.md). This creates clearer semantic distinction: Tutorials are procedural/hands-on, Explanations are conceptual/deep-dive.

Also applies to: 301-301


341-360: New card entry is well-integrated with correct cross-references.

The card title, description, and link target correctly reference the new effective-search.md article. The reference label "effective-fulltext-search" at line 342 matches the file header label (verified at effective-search.md:1), and the toctree entry at line 370 correctly resolves to docs/feature/search/fts/effective-search.md. Tag assignments (Introduction, Analyzer, Tokenizer, Plugin) accurately reflect article content.

Contributor

@matriv matriv left a comment

thx, not much here, since it's from an external author and must be taken as is.

Base automatically changed from explain to main October 24, 2025 18:50
@amotl amotl force-pushed the explain-effective-search branch from be80745 to 0370884 October 27, 2025 09:55
@amotl amotl merged commit 4f5f615 into main Oct 27, 2025
3 checks passed
@amotl amotl deleted the explain-effective-search branch October 27, 2025 10:54
@bmunkholm
Contributor

@moll Is there anything technical in the content that isn't already mentioned in the docs?

Labels

reorganize: Moving content around, inside and between other systems.

5 participants