@amotl (Member) commented Sep 30, 2025

About

Thanks for making a start on improving the sharding guidelines, @WalBeh. We just added a few copy-edits.

Preview

References

Review

Please also add your comments and suggestions where applicable. 🙏

/cc @karynzv, @hammerhead, @WalBeh, @surister, @kneth

@amotl amotl requested review from WalBeh and seut September 30, 2025 22:16

coderabbitai bot commented Sep 30, 2025

Walkthrough

Documentation updates reorganize and reword sharding/partitioning and sharding performance guidance, adjust shard-size and shard-count recommendations in examples, retitle one admin page, and replace two external Locust CLI links with internal anchors. No code or public API changes.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| Admin: sharding & partitioning (`docs/admin/sharding-partitioning.md`) | Retitled to "Sharding and Partitioning 101"; removed an over-sharding admonition; added a benchmark-driven "Strategy" section and caution blocks; updated shard-size guidance/formatting and shard-count discussion (examples and calculations revised, time-series/example updated to use 6 shards, adjusted CREATE TABLE / PARTITION / PARTITIONING examples); added external guidance references. |
| Performance: scaling wording (`docs/performance/scaling.md`) | Editorial clarifications: refined shard terminology and punctuation; removed an explicit inline note about replicas generating additional shards; replaced a named guide link with a general sharding guide reference; minor readability tweaks. |
| Performance: sharding guide (restructured) (`docs/performance/sharding.md`) | Major restructure and rephrasing: normalized heading case; added cross-references; replaced prior sections with "Sizing considerations" and split content into sizing-focused subsections (shard size vs number, shard-per-CPU, 1000-shards/node limit, partitions, replicas, segments); added notes/caution blocks and updated ingestion guidance to emphasize benchmarking. |
| Feature: cluster synopsis (`docs/feature/cluster/index.md`) | Formatting adjustment to shard-size range presentation (e.g., "5-50 GB" spacing/formatting); no semantic control-flow changes. |
| Integrations: Locust tutorial links (`docs/integrate/locust/tutorial.md`) | Replaced two external CrateDB CLI hyperlinks with internal project anchors (`project:#cli`) for CLI reference links. |
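
To illustrate how the sizing considerations named in the summary above might combine, here is a minimal Python sketch. It is not taken from the patch: the function name `check_layout` and its parameters are hypothetical, and the "5-50 GB" range and the 1000-shards-per-node figure are used only as the rules of thumb quoted in the summary. The sketch also assumes that data spreads evenly across shards and that shards spread evenly across nodes, which real clusters may not honor.

```python
# Rule-of-thumb figures quoted in the change summary; treated here as
# assumptions for illustration, not as verified hard limits.
MIN_SHARD_GB = 5
MAX_SHARD_GB = 50
MAX_SHARDS_PER_NODE = 1000


def check_layout(partition_size_gb, shards_per_partition, partitions, replicas, nodes):
    """Return a list of warnings for a hypothetical table layout.

    Assumes even data distribution across shards and even shard
    distribution across nodes.
    """
    if shards_per_partition < 1 or partitions < 1 or nodes < 1 or replicas < 0:
        raise ValueError("counts must be positive (replicas may be zero)")

    warnings = []

    # Average size of one primary shard within a single partition.
    shard_gb = partition_size_gb / shards_per_partition
    if shard_gb < MIN_SHARD_GB:
        warnings.append(f"shards average {shard_gb:.1f} GB, below {MIN_SHARD_GB} GB")
    elif shard_gb > MAX_SHARD_GB:
        warnings.append(f"shards average {shard_gb:.1f} GB, above {MAX_SHARD_GB} GB")

    # Every partition carries its own set of shards, and every replica
    # adds one more copy of each primary shard.
    total_copies = shards_per_partition * partitions * (1 + replicas)
    per_node = total_copies / nodes
    if per_node > MAX_SHARDS_PER_NODE:
        warnings.append(
            f"about {per_node:.0f} shards per node, above {MAX_SHARDS_PER_NODE}"
        )

    return warnings


if __name__ == "__main__":
    # Hypothetical layout: 120 GB in one partition, 6 shards, 1 replica, 3 nodes.
    print(check_layout(120, 6, 1, 1, 3))
```

A check like this can only flag layouts that fall outside the quoted ranges; as the summary notes, the revised guidance emphasizes benchmarking for the final decision.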

Sequence Diagram(s)

(No sequence diagrams: changes are documentation-only and do not introduce or modify runtime control flow.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Possibly related issues

Possibly related PRs

Suggested labels

sanding-500, refactoring, guidance

Suggested reviewers

  • karynzv
  • hammerhead
  • surister
  • kneth

Poem

I nibble docs beneath the moonlit log,
I shuffle shards like leaves in fog.
Benchmarks whisper how many to keep,
Six snug burrows where data sleeps.
Hop, tidy guides — a carrot to reap! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly captures the core change of the PR, which is enhancing the sharding strategy guidance and providing updated recommendations, and it avoids unnecessary detail or generic wording, making it clear and contextually accurate. |
| Description Check | ✅ Passed | The description clearly relates to the PR’s goal of improving the sharding guidelines, references relevant context and preview links, and remains on-topic without introducing unrelated content. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

:::

## Optimising for query performance
### Avoid imbalances
@amotl (Member, Author) commented Sep 30, 2025

About detecting imbalances, let's quickly introduce and refer to the XMover utility at this spot as soon as it is released.

@amotl (Member, Author) commented Oct 1, 2025

Please review the patches enumerated here, so we can release the XMover utility and mention it here. It is a perfect fit for extending the document to guide readers into applying excellent tooling.

@amotl (Member, Author) commented Oct 7, 2025

@amotl amotl marked this pull request as ready for review September 30, 2025 22:32
coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as resolved.

coderabbitai[bot]

This comment was marked as duplicate.

@amotl amotl requested a review from seut October 1, 2025 21:55
@amotl amotl requested review from matriv and removed request for WalBeh October 3, 2025 13:42
@seut (Member) left a comment

looks good now, thanks!

@matriv (Contributor) left a comment

Thank you for your effort here, @amotl.

Additionally, I believe we could mention increasing or decreasing the number of shards after table creation [1], maybe with a special mention of `number_of_routing_shards` [2], since if this value is not set correctly, it might prevent users from increasing their shard count.

Edit: Diverted to GH-376.

Footnotes

  1. https://cratedb.com/docs/crate/reference/en/latest/general/ddl/alter-table.html

  2. https://cratedb.com/docs/crate/reference/en/latest/sql/statements/create-table.html#sql-create-table-number-of-routing-shards

@amotl amotl force-pushed the sharding-advices branch from 95dba4d to d4ca40a Compare October 7, 2025 09:29
@coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95dba4d and d4ca40a.

📒 Files selected for processing (4)
  • docs/admin/sharding-partitioning.md (4 hunks)
  • docs/feature/cluster/index.md (1 hunks)
  • docs/performance/scaling.md (1 hunks)
  • docs/performance/sharding.md (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/performance/scaling.md
  • docs/feature/cluster/index.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs

Comment on lines +114 to +119
### Segments

The number of segments within a shard affects query performance because more
segments have to be visited.

@amotl (Member, Author) commented Oct 7, 2025

@matriv, @seut: This section is a bit flat. Do you have any suggestions to unflatten it slightly, to possibly provide better insights and guidance?

@amotl (Member, Author) commented Oct 7, 2025

coderabbitai[bot]

This comment was marked as resolved.

@amotl (Member, Author) commented Oct 7, 2025

Hi. We created those tickets to track backlog items coming from this patch and the comments on it. Thanks for your valuable feedback across the board. We hope you are fine with tackling them in subsequent iterations.

@matriv (Contributor) left a comment

Thank you! Please let @seut have another go at checking the new changes.

@matriv matriv requested a review from seut October 7, 2025 10:29
@seut (Member) left a comment

thanks! just a note related to the minimum shard size, slipped through on earlier reviews, sry.

coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl force-pushed the sharding-advices branch from 40782c2 to 1d73a79 Compare October 7, 2025 13:02
coderabbitai[bot]

This comment was marked as resolved.

@amotl amotl merged commit adfe31e into main Oct 7, 2025
3 checks passed
@amotl amotl deleted the sharding-advices branch October 7, 2025 17:37