|
| 1 | +- Title: Geosearch |
| 2 | +- Start Date: 2021-08-02 |
| 3 | +- Specification PR: [#59](https://github.com/meilisearch/specifications/pull/59) |
| 4 | +- Discovery Issue: [#42](https://github.com/meilisearch/product/issues/42) |
| 5 | + |
| 6 | +# Geosearch |
| 7 | + |
| 8 | +## 1. Functional Specification |
| 9 | + |
| 10 | +### I. Summary |
| 11 | + |
| 12 | +The purpose of this specification is to add a first iteration of the **geosearch** feature to give geo-filtering and geosorting capabilities at search time. |
| 13 | + |
| 14 | +#### Summary Key points |
| 15 | + |
| 16 | +- Documents MUST have a `_geo` reserved object to be geosearchable. |
| 17 | +- Filter documents by a given geo radius using the built-in filter `_geoRadius({lat}, {lng}, {distance_in_meters})`. Several geosearch filters can be combined within the `filter` field. |
| 18 | +- Sort documents in ascending/descending order around a geo point. e.g. `_geoPoint({lat}, {lng}):asc`. |
| 19 | +- It is possible to filter and/or sort by geographical criteria of the user's choice. |
| 20 | +- `_geo` must be set as a filterable attribute to use geo filtering capabilities. |
| 21 | +- `_geo` must be set as a sortable attribute to use geo sort capabilities. |
| 22 | +- There is no `geo` ranking rule that can be manipulated by the user. Geo sorting is integrated into the `sort` ranking rule by default and is activated by sorting with the `_geoPoint({lat}, {lng})` built-in sort rule. |
| 23 | +- Using `_geoPoint({lat}, {lng})` in the `sort` parameter at search leads the engine to return a `_geoDistance` within the search results. This field represents the distance in meters of the document from the specified `_geoPoint`. |
| 24 | +- Add an `invalid_geo_field` error. |
| 25 | + |
| 26 | +### II. Motivation |
| 27 | + |
| 28 | +According to our user feedback, the lack of a geosearch feature is one of the biggest deal-breakers when choosing MeiliSearch as a search engine. Some use cases specifically require integrated geosearch capabilities, and many direct competitors offer them. Today, users must resort to workarounds such as geohashes to geosearch documents. Implementing this feature serves these users better and broadens the range of use cases MeiliSearch can address. |
| 29 | + |
| 30 | +### III. Technical Explanations |
| 31 | + |
| 32 | +#### **As a developer, I want to add geospatial coordinates to a document so that the document can be geosearchable.** |
| 33 | + |
| 34 | +- Introduce a reserved field `_geo` for documents to store geospatial data as an **object** made of `lat` and `lng` fields in the **JSON format**. |
| 35 | +- Introduce a reserved column `_geo` for documents to store geospatial data as a **string** made of `lat,lng` in the **CSV format**. |
| 36 | + |
| 37 | +##### **JSON Format** |
| 38 | + |
| 39 | +**`_geo` field definition** |
| 40 | + |
| 41 | +- Name: `_geo` |
| 42 | +- Type: Object |
| 43 | +- Format: `{lat:float, lng:float}` |
| 44 | +- Not required |
| 45 | + |
| 46 | +> 💡 If `_geo` is found in the document payload, `lat` and `lng` are required. |
| 47 | +> 💡 `lat` and `lng` must be float values. |
| 48 | +
|
| 49 | +##### **CSV Format** |
| 50 | + |
| 51 | +Following the format already defined in the https://github.com/meilisearch/specifications/pull/28/files specification for indexing documents from CSV, a reserved column `_geo` can be added to specify the geographical coordinates of a document. |
| 52 | + |
| 53 | +CSV format example |
| 54 | +``` |
| 55 | +"id:number","label","brand","_geo" |
| 56 | +"1","F40","Ferrari","48.862725,2.287592" |
| 57 | +``` |
| 58 | + |
| 59 | +**`_geo` column definition** |
| 60 | + |
| 61 | +- Name: `_geo` |
| 62 | +- Type: String |
| 63 | +- Format: `"lat:float,lng:float"` |
| 64 | +- Not required |
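
The two `_geo` definitions above can be checked with a small validation routine. The following is a minimal sketch, not the engine's implementation: `InvalidGeoField`, `parse_geo_json`, and `parse_geo_csv` are hypothetical names, and accepting JSON integers (e.g. `48`) as valid float values is an assumption the specification does not settle.

```python
from typing import Any, Tuple


class InvalidGeoField(ValueError):
    """Hypothetical stand-in for the `invalid_geo_field` error."""


def _to_float(value: Any, name: str) -> float:
    # bool is a subclass of int in Python, so reject it explicitly.
    # Assumption: plain integers are accepted and converted to float.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise InvalidGeoField(f"`{name}` must be a float, got {value!r}")
    return float(value)


def parse_geo_json(geo: Any) -> Tuple[float, float]:
    """Validate a JSON `_geo` object of the form {"lat": float, "lng": float}."""
    if not isinstance(geo, dict):
        raise InvalidGeoField("`_geo` must be an object")
    for key in ("lat", "lng"):
        if key not in geo:
            raise InvalidGeoField(f"`_geo.{key}` is required")
    return _to_float(geo["lat"], "lat"), _to_float(geo["lng"], "lng")


def parse_geo_csv(cell: str) -> Tuple[float, float]:
    """Validate a CSV `_geo` cell of the form "lat,lng"."""
    parts = cell.split(",")
    if len(parts) != 2:
        raise InvalidGeoField('`_geo` must be formatted as "lat,lng"')
    try:
        return float(parts[0]), float(parts[1])
    except ValueError as exc:
        raise InvalidGeoField(f"`_geo` is not a pair of floats: {cell!r}") from exc
```

Both parsers return a `(lat, lng)` tuple, so the JSON and CSV indexing paths can share the same downstream handling.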
| 65 | + |
| 66 | +#### POST Add or replace documents `/indexes/{indexUid}/documents` |
| 67 | + |
| 68 | +##### Request body |
| 69 | +``` |
| 70 | +[ |
| 71 | + { |
| 72 | + "id": 1, |
| 73 | + "label": "F40", |
| 74 | + "brand": "Ferrari", |
| 75 | + "_geo": { |
| 76 | + "lat": 48.862725, |
| 77 | + "lng": 2.287592 |
| 78 | + } |
| 79 | + } |
| 80 | +] |
| 81 | +``` |
| 82 | + |
| 83 | +##### 202 Accepted - Response body |
| 84 | + |
| 85 | +``` |
| 86 | +{ |
| 87 | + "updateId": 1 |
| 88 | +} |
| 89 | +``` |
| 90 | + |
| 91 | +#### PUT Add or update documents `/indexes/{indexUid}/documents` |
| 92 | + |
| 93 | +##### Request body |
| 94 | +``` |
| 95 | +[ |
| 96 | + { |
| 97 | + "id": 1, |
| 98 | +    "label": "F40 LM", |
| 99 | + "brand": "Ferrari", |
| 100 | + "_geo": { |
| 101 | + "lat": 48.862725, |
| 102 | + "lng": 2.287592 |
| 103 | + } |
| 104 | + } |
| 105 | +] |
| 106 | +``` |
| 107 | + |
| 108 | +##### 202 Accepted - Response body |
| 109 | + |
| 110 | +``` |
| 111 | +{ |
| 112 | + "updateId": 2 |
| 113 | +} |
| 114 | +``` |
| 115 | + |
| 116 | +> 🔴 Providing a malformed `_geo` field that does not conform to the format causes the `update` to fail. A new `invalid_geo_field` error is given in the `update` object. |
| 117 | +
|
| 118 | +##### Errors Definition |
| 119 | + |
| 120 | +## invalid_geo_field |
| 121 | + |
| 122 | +### Context |
| 123 | + |
| 124 | +This error occurs when the `_geo` field of a document payload is not valid. |
| 125 | + |
| 126 | +### Error Definition |
| 127 | + |
| 128 | +```json |
| 129 | +{ |
| 130 | + "message": "The _geo field is invalid. :syntaxErrorHelper.", |
| 131 | + "code": "invalid_geo_field", |
| 132 | + "type": "invalid_request", |
| 133 | + "link": "https://docs.meilisearch.com/errors#invalid_geo_field" |
| 134 | +} |
| 135 | +``` |
| 136 | + |
| 137 | +- The `:syntaxErrorHelper` is inferred when a syntax error is encountered. |
| 138 | + |
| 139 | +--- |
| 140 | + |
| 141 | +### **As an end-user, I want to filter documents within a geo radius.** |
| 142 | + |
| 143 | +- Introduce a `_geoRadius({lat}, {lng}, {distance_in_meters})` built-in filter rule to `filter` documents within a geo radius. |
| 144 | + |
| 145 | +**`_geoRadius` built-in filter rule definition** |
| 146 | + |
| 147 | +- Name: `_geoRadius` |
| 148 | +- Signature: ({lat:float}:required, {lng:float}:required, {distance_in_meters:int}:required) |
| 149 | +- Not required |
| 150 | +- `distance_in_meters` accepts only positive values. |
| 151 | + |
| 152 | +> The `_geo` field has to be set in `filterableAttributes` setting by the developer to activate geo filtering capabilities at search. |
| 153 | +
|
| 154 | +#### GET Search `/indexes/{indexUid}/search` |
| 155 | + |
| 156 | +``` |
| 157 | +...&filter="brand=Mercedes AND _geoRadius(48.862725, 2.287592, 2000)" |
| 158 | +``` |
| 159 | + |
| 160 | +#### POST Search `/indexes/{indexUid}/search` |
| 161 | + |
| 162 | +``` |
| 163 | +{ |
| 164 | + "filter": ["brand = Ferrari", "_geoRadius(48.862725, 2.287592, 2000)"] |
| 165 | +} |
| 166 | +``` |
| 167 | + |
| 168 | +> 🔴 Specifying parameters that do not conform to the `_geoRadius` signature causes the API to return an `invalid_filter` error. The error message should indicate how `_geoRadius` should be used. See `_geoRadius` built-in filter rule definition part. |
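
To illustrate the `_geoRadius` signature, here is a minimal client-side sketch that builds the filter expression and rejects arguments that do not match the definition above. `geo_radius_filter` is a hypothetical helper, not part of any MeiliSearch SDK, and treating `0` as invalid (strictly positive distances only) is an assumption.

```python
def geo_radius_filter(lat: float, lng: float, distance_in_meters: int) -> str:
    """Build a `_geoRadius(lat, lng, distance_in_meters)` filter expression."""
    for name, value in (("lat", lat), ("lng", lng)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a float")
    if isinstance(distance_in_meters, bool) or not isinstance(distance_in_meters, int):
        raise ValueError("distance_in_meters must be an integer")
    # Assumption: "positive" is read as strictly greater than zero.
    if distance_in_meters <= 0:
        raise ValueError("distance_in_meters must be positive")
    return f"_geoRadius({lat}, {lng}, {distance_in_meters})"


# Body for POST /indexes/{indexUid}/search, mirroring the example above.
search_body = {
    "filter": ["brand = Ferrari", geo_radius_filter(48.862725, 2.287592, 2000)]
}
```

Validating on the client only saves a round trip; the engine remains responsible for returning `invalid_filter` on malformed input.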
| 169 | +
|
| 170 | +--- |
| 171 | + |
| 172 | +### **As an end-user, I want to sort documents around a geo point.** |
| 173 | + |
| 174 | +- Introduce a `_geoPoint({lat}, {lng})` function parameter to `sort` documents around a central point. |
| 175 | + |
| 176 | +**`_geoPoint` built-in sort rule definition** |
| 177 | + |
| 178 | +- Name: `_geoPoint` |
| 179 | +- Signature: ({lat:float}:required, {lng:float}:required) |
| 180 | +- Not required |
| 181 | + |
| 182 | +Following the [`sort` specification feature](https://github.com/meilisearch/specifications/pull/55): |
| 183 | +> The `_geo` field has to be set in `sortableAttributes` setting by the developer to activate geo sorting capabilities at search. |
| 184 | +> |
| 185 | +>There is no `geo` ranking rule as such. Geo sorting is handled internally by the `sort` ranking rule. |
| 186 | +> |
| 187 | +>`_geoPoint` built-in sort rule can sort documents in ascending or descending order. See Technical Aspects part. |
| 188 | +
|
| 189 | +#### GET Search `/indexes/{indexUid}/search` |
| 190 | + |
| 191 | +``` |
| 192 | + ...&sort=_geoPoint({lat}, {lng}):asc,price:desc |
| 193 | +``` |
| 194 | + |
| 195 | +#### POST Search `/indexes/{indexUid}/search` |
| 196 | + |
| 197 | +``` |
| 198 | +{ |
| 199 | +    "sort": "_geoPoint({lat}, {lng}):asc,price:desc" |
| 200 | +} |
| 201 | +``` |
| 202 | +> 🔴 Specifying parameters that do not conform to the `_geoPoint` signature causes the API to return an `invalid_sort` error. The error message should indicate how `_geoPoint` should be used. See `_geoPoint` built-in sort rule definition part. |
| 203 | +
|
| 204 | +--- |
| 205 | + |
| 206 | +### **As an end-user, I want to know the document distance when I am sorting around a geo point.** |
| 207 | + |
| 208 | +- Introduce a `_geoDistance` parameter to the search result `hit` object. |
| 209 | + |
| 210 | +**`_geoDistance` field definition** |
| 211 | + |
| 212 | +- Name: `_geoDistance` |
| 213 | +- Description: The distance in meters between the document and the `_geoPoint` used to sort the results. |
| 214 | +- Type: int |
| 215 | +- Not required |
| 216 | + |
| 217 | +> 💡 The `_geoDistance` response field is only computed and shown when the end-user has sorted documents around a `_geoPoint`. If the end-user filters documents using the `_geoRadius` built-in filter without sorting them around a `_geoPoint`, `_geoDistance` will not appear in the search response. |
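
The relationship between `_geoPoint` sorting and `_geoDistance` can be sketched as follows. This is an illustration only: the specification does not state which distance formula the engine uses, so the haversine formula and the mean Earth radius below are assumptions, as are the helper names `haversine_m` and `sort_by_geo_point`. Every input document is assumed to carry a valid `_geo` object.

```python
import math
from typing import Any, Dict, List

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def sort_by_geo_point(docs: List[Dict[str, Any]], lat: float, lng: float,
                      descending: bool = False) -> List[Dict[str, Any]]:
    """Return copies of docs sorted around (lat, lng), each with `_geoDistance`."""
    hits = []
    for doc in docs:
        geo = doc["_geo"]
        distance = haversine_m(lat, lng, geo["lat"], geo["lng"])
        # `_geoDistance` is defined as an int, so round to whole meters.
        hits.append({**doc, "_geoDistance": round(distance)})
    hits.sort(key=lambda hit: hit["_geoDistance"], reverse=descending)
    return hits
```

Passing `descending=True` corresponds to `_geoPoint({lat}, {lng}):desc`; the input documents are left unmodified because each hit is a shallow copy.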
| 218 | +
|
| 219 | +### IV. Finalized Key Changes |
| 220 | + |
| 221 | +- Add a `_geo` reserved field to the JSON and CSV formats to index the geo point coordinates of a document. |
| 222 | +- Add a `_geoPoint(lat, lng)` built-in sort rule. |
| 223 | +- Add a `_geoRadius(lat, lng, distance_in_meters)` built-in filter rule. |
| 224 | +- Return a `_geoDistance` in `hits` objects representing the distance in meters computed from the `_geoPoint` built-in sort rule. |
| 225 | + |
| 226 | +## 2. Technical Aspects |
| 227 | + |
| 228 | +### I. :desc case - Sorting documents around a geo point |
| 229 | + |
| 230 | +We may encounter technical difficulties when implementing descending order for geo sorting. This first iteration will allow us to determine whether this is a real technical problem. If it is, we will decide on the best solution at that point. |
| 231 | + |
| 232 | +> 💡 As a first step, we could disallow `:desc` on a `_geoPoint` if it proves to be a complex technical issue. |
| 233 | +
|
| 234 | +### II. Measuring |
| 235 | + |
| 236 | +- `filterableAttributes` setting definition to evaluate `_geo` presence. |
| 237 | +- `sortableAttributes` setting definition to evaluate `_geo` presence. |
| 238 | + |
| 239 | +## 3. Future Possibilities |
| 240 | + |
| 241 | +- Add built-in filters to filter documents within a `polygon` or a `bounding-box`. |
| 242 | +- Handle arrays of geo points in the document object. |
| 243 | +- Handle multiple geo formats for the `_geo` field, e.g. "{lat},{lng}", a geohash, etc. |