crate
diff --git a/‎docs/modelling/fulltext.md‎
Lines changed: 150 additions & 0 deletions b/‎docs/modelling/fulltext.md‎
diff --git a/‎docs/modelling/geospatial.md‎
Lines changed: 101 additions & 0 deletions b/‎docs/modelling/geospatial.md‎
diff --git a/‎docs/modelling/index.md‎
Lines changed: 21 additions & 0 deletions b/‎docs/modelling/index.md‎
@@ -0,0 +1,150 @@
+# Full-text data
+
+CrateDB features **native full‑text search** powered by **Apache Lucene** and Okapi BM25 ranking, fully accessible via SQL. You can blend this seamlessly with other data types—JSON, time‑series, geospatial, vectors and more—all in a single SQL query platform.
+
+## 1. Data Types & Indexing Strategy
+
+* By default, all text columns are indexed as `plain` (raw, unanalyzed), which is efficient for equality search but not suitable for full-text queries.
+* To enable full‑text search, you must define a **FULLTEXT index** with an optional language **analyzer**, e.g.:
+
+```sql
+CREATE TABLE documents (
+  title       TEXT,
+  body        TEXT,
+  INDEX ft_body USING FULLTEXT(body) WITH (analyzer = 'english')
+);
+```
+
+* You may also define **composite full-text indices**, indexing multiple columns at once:
+
+```sql
+-- Index definition inside the column list of a CREATE TABLE statement
+INDEX ft_all USING FULLTEXT(title, body) WITH (analyzer = 'english')
+```
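+
+As a fuller sketch, the composite index fragment above might sit in a table definition like the following. The table name `articles` is a hypothetical chosen for illustration.
+
+```sql
+-- Hypothetical table: one composite full-text index covering two columns
+CREATE TABLE articles (
+  title TEXT,
+  body  TEXT,
+  INDEX ft_all USING FULLTEXT(title, body) WITH (analyzer = 'english')
+);
+
+-- A single MATCH against the composite index searches both columns
+SELECT title, _score
+FROM articles
+WHERE MATCH(ft_all, 'search term')
+ORDER BY _score DESC;
+```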
+
+## 2. Index Design & Custom Analyzers
+
+| Component         | Purpose                                                                      |
+| ----------------- | ---------------------------------------------------------------------------- |
+| **Analyzer**      | Tokenizer + token filters + char filters; splits text into searchable terms. |
+| **Tokenizer**     | Splits on whitespace/characters.                                             |
+| **Token Filters** | e.g. lowercase, stemming, stop‑word removal.                                 |
+| **Char Filters**  | Pre-processing (e.g. stripping HTML).                                        |
+
+CrateDB offers **built-in analyzers** for many languages (e.g. English, German, French). You can also **create custom analyzers**:
+
+```sql
+CREATE ANALYZER myanalyzer (
+  TOKENIZER whitespace,
+  TOKEN_FILTERS (lowercase, kstem),
+  CHAR_FILTERS (html_strip)
+);
+```
+
+Or **extend** a built-in analyzer:
+
+```sql
+CREATE ANALYZER german_snowball
+  EXTENDS snowball
+  WITH (language = 'german');
+```
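+
+A custom analyzer is referenced by name in an index definition. A minimal sketch, assuming the `myanalyzer` analyzer created above and a hypothetical `pages` table:
+
+```sql
+-- Hypothetical table whose full-text index uses the custom analyzer
+CREATE TABLE pages (
+  content TEXT,
+  INDEX ft_content USING FULLTEXT(content) WITH (analyzer = 'myanalyzer')
+);
+```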
+
+## 3. Querying: MATCH Predicate & Scoring
+
+CrateDB uses the SQL `MATCH` predicate to run full‑text queries against full‑text indices. It optionally returns a relevance score `_score`, ranked via BM25.
+
+**Basic usage:**
+
+```sql
+SELECT title, _score
+FROM documents
+WHERE MATCH(ft_body, 'search term')
+ORDER BY _score DESC;
+```
+
+**Searching multiple indices with weighted ranking:**
+
+```sql
+-- The number after a column or index name is its boost (weight); the default is 1
+MATCH((ft_title 2.0, ft_body), 'keyword')
+```
+
+**You can configure match options like:**
+
+* `using best_fields` (default)
+* `fuzziness = 1` (tolerate minor typos)
+* `operator = 'and'` or `'or'` (the default is `'or'`)
+* `slop = N` for phrase proximity, used with the `phrase` and `phrase_prefix` match types
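+
+As a sketch of how such options are passed, the queries below assume the `documents` table and `ft_body` index from section 1. The search terms are illustrative.
+
+```sql
+-- Require every term to match instead of any term
+SELECT title, _score
+FROM documents
+WHERE MATCH(ft_body, 'distributed database') USING best_fields WITH (operator = 'and')
+ORDER BY _score DESC;
+
+-- Phrase search with a slop of 2, so the terms need not be directly adjacent
+SELECT title, _score
+FROM documents
+WHERE MATCH(ft_body, 'distributed database') USING phrase WITH (slop = 2)
+ORDER BY _score DESC;
+```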
+
+**Example: Fuzzy Search**
+
+```sql
+SELECT firstname, lastname, _score
+FROM person
+WHERE MATCH(lastname_ft, 'bronw') USING best_fields WITH (fuzziness = 2)
+ORDER BY _score DESC;
+```
+
+This matches similar names like ‘brown’ or ‘browne’.
+
+**Example: Multi‑language Composite Search**
+
+```sql
+CREATE TABLE documents (
+  name        STRING PRIMARY KEY,
+  description TEXT,
+  INDEX ft_en USING FULLTEXT(description) WITH (analyzer = 'english'),
+  INDEX ft_de USING FULLTEXT(description) WITH (analyzer = 'german')
+);
+
+SELECT name, _score
+FROM documents
+WHERE MATCH((ft_en, ft_de), 'jupm OR verwrlost') USING best_fields WITH (fuzziness = 1)
+ORDER BY _score DESC;
+```
+
+## 4. Use Cases & Integration
+
+CrateDB is ideal for searching **semi-structured large text data**—product catalogs, article archives, user-generated content, descriptions and logs.
+
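+Because full-text indices are updated in near real time, search results reflect newly ingested data almost instantly: new rows become searchable after the next table refresh, which by default happens about once per second. This tight integration avoids the complexity of maintaining separate search infrastructure.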
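+
+A minimal sketch of this behaviour, assuming the `documents` table with `title` and `body` columns from section 1. `REFRESH TABLE` is only needed when a query must see rows written immediately beforehand.
+
+```sql
+INSERT INTO documents (title, body)
+VALUES ('Release notes', 'Full-text search is now available in SQL');
+
+-- Optional: make the new row visible to searches right away
+REFRESH TABLE documents;
+
+SELECT title, _score
+FROM documents
+WHERE MATCH(ft_body, 'search')
+ORDER BY _score DESC;
+```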
+
+You can **combine full-text search with other data domains**, for example:
+
+```sql
+SELECT *
+FROM listings
+WHERE
+  MATCH(ft_desc, 'garden deck') AND
+  price < 500000 AND
+  within(location, ?);  -- bind a WKT or GeoJSON polygon as the parameter
+```
+
+This blend lets you query by text relevance, numeric filters, and spatial constraints, all in one.
+
+## 5. Architectural Strengths
+
+* **Built on the Lucene inverted index and BM25**, offering relevance ranking comparable to dedicated search engines.
+* **Scales horizontally across clusters** while maintaining fast indexing and search, even on high-volume datasets.
+* **Integrated SQL interface** that removes the need for a separate search service such as Elasticsearch or Solr.
+
+## 6. Best Practices Checklist
+
+| Topic               | Recommendation                                                                     |
+| ------------------- | ---------------------------------------------------------------------------------- |
+| Schema & Indexing   | Define full-text indices at table creation; plain indices are insufficient.        |
+| Language Support    | Pick a built-in analyzer that matches your content language.                       |
+| Composite Search    | Use multi-column indices to search across title/body/fields.                       |
+| Query Tuning        | Configure fuzziness, operator, boost, and slop options.                            |
+| Scoring & Ranking   | Use `_score` and ordering to sort by relevance.                                    |
+| Real-time Updates   | Full-text indices update automatically on INSERT/UPDATE.                           |
+| Multi-model Queries | Combine full-text search with geo, JSON, numerical filters.                        |
+| Analyze Limitations | Understand phrase\_prefix caveats at scale; tune analyzer/tokenizer appropriately. |
+
+## 7. Further Learning & Resources
+
+* **CrateDB Full‑Text Search Guide**: details index creation, analyzers, MATCH usage.
+* **FTS Options & Advanced Features**: fuzziness, synonyms, multi-language idioms.
+* **Hands‑On Academy Course**: explore FTS on real datasets (e.g. Chicago neighborhoods).
+* **CrateDB Community Insights**: real‑world advice and experiences from users.
+
+## 8. Summary
+
+CrateDB combines powerful Lucene-based full-text search capabilities with SQL, making it easy to model and query textual data at scale. It supports fuzzy matching, multi-language analysis, and composite indexing, and integrates fully with other data types for rich, multi-model queries. Whether you're building document search, catalog lookup, or content analytics, CrateDB offers a flexible and scalable foundation.
@@ -0,0 +1,101 @@
+# Geospatial data
+
+CrateDB supports **real-time geospatial analytics at scale**, enabling you to store, query, and analyze location-based data using standard SQL over two dedicated types: **GEO\_POINT** and **GEO\_SHAPE**. You can seamlessly combine spatial data with full-text, vector, JSON, or time-series in the same SQL queries.
+
+## 1. Geospatial Data Types
+
+### **GEO\_POINT**
+
+* Stores a single location via latitude/longitude.
+* Insert using either a coordinate array `[lon, lat]` or WKT string `'POINT (lon lat)'`.
+* Must be declared explicitly; dynamic schema inference will not detect geo\_point type.
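+
+Both insert formats can be sketched as follows. The `poi` table and its values are hypothetical.
+
+```sql
+-- Hypothetical table with an explicitly declared GEO_POINT column
+CREATE TABLE poi (
+  name     TEXT,
+  location GEO_POINT
+);
+
+-- Coordinate array: longitude first, then latitude
+INSERT INTO poi (name, location) VALUES ('Berlin', [13.405, 52.52]);
+
+-- WKT string: also longitude first, then latitude
+INSERT INTO poi (name, location) VALUES ('Vienna', 'POINT (16.3738 48.2082)');
+```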
+
+### **GEO\_SHAPE**
+
+* Supports complex geometries (Point, LineString, Polygon, MultiPolygon, GeometryCollection) via GeoJSON or WKT.
+* Indexed using a geohash, quadtree, or BKD-tree index, with configurable precision (e.g. `50m`) and error threshold.
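+
+The index type and its parameters are set in the column definition. A sketch with a hypothetical `delivery_zones` table; the parameter values are illustrative, not recommendations.
+
+```sql
+CREATE TABLE delivery_zones (
+  name TEXT,
+  -- quadtree index with 50 m precision and a 2.5 % distance error threshold
+  area GEO_SHAPE INDEX USING quadtree WITH (precision = '50m', distance_error_pct = 0.025)
+);
+```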
+
+## 2. Table Schema Example
+
+```sql
+CREATE TABLE parcel_zones (
+    zone_id INTEGER PRIMARY KEY,
+    name VARCHAR,
+    area GEO_SHAPE,
+    centroid GEO_POINT
+)
+WITH (column_policy = 'dynamic');
+```
+
+* Use `GEO_SHAPE` to define zones or service areas.
+* Use `GEO_POINT` for single locations, e.g. the approximate center of a zone.
+
+## 3. Core Geospatial Functions
+
+CrateDB provides key scalar functions for spatial operations:
+
+* **`distance(geo_point1, geo_point2)`** – returns meters using the Haversine formula (e.g. compute distance between two points)
+* **`within(shape1, shape2)`** – true if `shape1` is fully contained within `shape2`
+* **`intersects(shape1, shape2)`** – true if shapes overlap or touch anywhere
+* **`latitude(geo_point)` / `longitude(geo_point)`** – extract individual coordinates
+* **`geohash(geo_point)`** – compute a 12‑character geohash for the point
+* **`area(geo_shape)`** – returns approximate area in square degrees; uses geodetic awareness
+
+Note: More precise relational operations on shapes may bypass indexes and can be slower.
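+
+The scalar functions can be tried on literals without any table. A sketch with illustrative coordinates:
+
+```sql
+-- Distance in meters, coordinate extraction, and geohash of a point
+SELECT
+  distance('POINT (13.405 52.52)', 'POINT (16.3738 48.2082)') AS meters,
+  latitude('POINT (13.405 52.52)')  AS lat,
+  longitude('POINT (13.405 52.52)') AS lon,
+  geohash('POINT (13.405 52.52)')   AS hash;
+
+-- Point-in-polygon check on literals
+SELECT within(
+  'POINT (10 10)',
+  'POLYGON ((5 5, 15 5, 15 15, 5 15, 5 5))'
+) AS inside;
+```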
+
+## 4. Spatial Queries & Indexing
+
+CrateDB supports Lucene-based spatial indexing (Prefix Tree and BKD-tree structures) for efficient geospatial search. Use the `MATCH` predicate to leverage indices when filtering spatial data by bounding boxes, circles, polygons, etc.
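+
+A sketch of the `MATCH` predicate on a `GEO_SHAPE` column, assuming the `parcel_zones` table from section 2. The polygon is illustrative. The match type after `USING` can be `intersects` (the default), `disjoint`, or `within`.
+
+```sql
+-- Zones whose area lies completely inside the given polygon, resolved via the geo index
+SELECT zone_id, name
+FROM parcel_zones
+WHERE MATCH(area, 'POLYGON ((5 5, 30 5, 30 30, 5 30, 5 5))') USING within;
+```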
+
+**Example: Find nearby assets**
+
+```sql
+-- Ten assets closest to a reference point, nearest first
+SELECT asset_id,
+       distance(asset_location, 'POINT(-1.234 51.050)'::GEO_POINT) AS dist
+FROM assets
+ORDER BY dist
+LIMIT 10;
+
+**Example: Count incidents within service area**
+
+```sql
+SELECT s.area_id, count(*) AS incident_count
+FROM incidents i
+JOIN service_areas s
+  ON within(i.location, s.area)
+GROUP BY s.area_id;
+
+**Example: Which zones intersect a flight path**
+
+```sql
+SELECT z.zone_id, z.name
+FROM flight_paths f
+JOIN service_zones z
+ON intersects(f.path_geom, z.area);
+```
+
+## 5. Real-World Examples: Chicago Use Cases
+
+* **311 calls**: Each record includes `location` as a `GEO_POINT`. Queries use `within()` to find calls inside a polygon drawn around O'Hare airport.
+* **Community areas**: Polygon boundaries are stored as `GEO_SHAPE`. Queries with `intersects()` return the areas that overlap an arbitrary line or polygon.
+* **Taxi rides**: Pickup and drop-off locations are stored as geo points. Use `distance()` to compute trip distances and aggregate them.
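+
+The first case might be sketched as below. The table name `calls_311`, its `location` column, and the polygon coordinates are hypothetical placeholders rather than the actual dataset schema.
+
+```sql
+-- Count calls whose location falls inside a rough box around the airport
+SELECT count(*) AS calls_in_area
+FROM calls_311
+WHERE within(
+  location,
+  'POLYGON ((-87.95 41.95, -87.85 41.95, -87.85 42.01, -87.95 42.01, -87.95 41.95))'
+);
+```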
+
+## 6. Architectural Strengths & Suitability
+
+* Designed for **real-time geospatial tracking and analytics** (e.g. fleet tracking, mapping, location-layered apps).
+* **Unified SQL platform**: spatial data can be combined with full-text search, JSON, vectors, time-series — in the same table or query.
+* **High ingest and query throughput**, suitable for large-scale location-based workloads.
+
+## 7. Best Practices Checklist
+
+| Topic                   | Recommendation                                                        |
+| ----------------------- | --------------------------------------------------------------------- |
+| Data types              | Declare `GEO_POINT`/`GEO_SHAPE` explicitly.                           |
+| Geometric formats       | Use WKT or GeoJSON for insertions.                                    |
+| Index tuning            | Choose geohash/quadtree/BKD-tree and adjust precision.                |
+| Queries                 | Prefer `MATCH` for indexed filtering; use functions for precise checks. |
+| Joins & spatial filters | Use within/intersects to correlate spatial entities.                  |
+| Scale & performance     | Index shapes; apply distance/within filters early.                    |
+| Mixed-model integration | Combine spatial with JSON, full-text, vector, time-series.            |
+
+## 8. Further Learning & Resources
+
+* Official **Geospatial Search Guide** in CrateDB docs, detailing geospatial types, indexing, and MATCH predicate usage.
+* CrateDB Academy **Hands-on: Geospatial Data** modules, with sample datasets (Chicago 311 calls, taxi rides, community zones) and example queries.
+* CrateDB Blog: **Geospatial Queries with CrateDB** – outlines capabilities, limitations, and practical use cases (available since version 0.40).
+
+## 9. Summary
+
+CrateDB provides robust support for geospatial modeling through clearly defined data types (`GEO_POINT`, `GEO_SHAPE`), powerful scalar functions (`distance`, `within`, `intersects`, `area`), and Lucene‑based indexing for fast queries. It excels in high‑volume, real‑time spatial analytics and integrates smoothly with multi-model use cases. Whether storing vehicle positions, mapping regions, or enabling spatial joins—CrateDB’s geospatial layer makes it easy, scalable, and extensible.
@@ -0,0 +1,21 @@
+# Data modelling
+
+CrateDB provides a unified storage engine that supports different data types.
+```{toctree}
+:maxdepth: 1
+
+relational
+json
+timeseries
+geospatial
+fulltext
+vector
+```
+
+Because CrateDB is a distributed OLAP database designed to store large volumes
+of data, a few modelling details need special consideration.
+```{toctree}
+:maxdepth: 1
+
+primary-key
+```