Elasticsearch 7 Support #831
This is the first step in supporting Elasticsearch 7. At this time, Pelias does not work out of the box on ES7, but with a Docker image ready to go, we can begin testing changes for compatibility. This Dockerfile and config are identical to the ES6 Docker image, except for the version change and one update to `elasticsearch.yml`: in ES7, the bulk thread pool is removed, and both bulk and non-bulk operations go through a single [write](https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-threadpool.html#modules-threadpool) thread pool. For Pelias, we have found that increasing the queue size of this thread pool helps imports succeed without errors, so the configuration file has been updated accordingly. Connects pelias/pelias#831
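To make the thread pool change concrete, here is a minimal Python sketch (not code from the Pelias Docker images) that picks the queue-size setting name by Elasticsearch major version. It assumes the older `thread_pool.bulk.queue_size` name for ES6 and `thread_pool.write.queue_size` for ES7. The function names and the queue size of 1000 are hypothetical and chosen only for illustration.

```python
def write_queue_setting(es_major: int, queue_size: int) -> dict:
    """Return the queue-size setting for the pool that handles bulk requests."""
    if queue_size <= 0:
        raise ValueError("queue_size must be positive")
    # ES7 removed the bulk thread pool; bulk requests go through the write pool.
    pool = "write" if es_major >= 7 else "bulk"
    return {f"thread_pool.{pool}.queue_size": queue_size}


def to_yaml_lines(settings: dict) -> str:
    """Render flat settings as lines suitable for elasticsearch.yml."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


# Hypothetical queue size; tune it for your own import workload.
print(to_yaml_lines(write_queue_setting(7, 1000)))
```

The rendered line is the kind of entry that would go into `elasticsearch.yml`. The right value depends on import volume and cluster size.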
This is necessary for Elasticsearch 7. Connects pelias/pelias#831
With the list of changes above as of this writing, an ES7 build and an import of a few million records for the Portland Metro area work well, and querying with the latest API causes no errors. I'm sure there's more work to do; in particular, I think at least one geo query related change will be required. Still, it looks like the core part of the ES7 upgrade is now fairly well understood! 🎉
The first error seen when trying to use our current schema with Elasticsearch 7 is:

```
[illegal_argument_exception] Token filter [word_delimiter] cannot be used to parse synonyms
```

The [word delimiter](https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-word-delimiter-tokenfilter.html) token filter is only used in one place: the `peliasAdmin` analyzer. Looking at the documentation for `word_delimiter`, it does _a lot_: splitting words, handling punctuation, and even some basic stemming. It really feels like an extremely broad tool, and at this point it feels like something that Elasticsearch would deprecate in the future.

Furthermore, looking at our integration tests, it seems one of the key reasons we used it was to tokenize on hyphens, which we have done using the `peliasNameTokenizer` since #375. Considering how complicated this token filter is, and how it's now being used with relatively little effect, it seems like something we can remove.

Connects pelias/pelias#831
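As a rough illustration of what removing the filter amounts to, the Python sketch below drops one token filter from one analyzer in an index settings dictionary. The `example_settings` structure is a hypothetical, heavily simplified stand-in; the real `peliasAdmin` filter chain in pelias/schema is different, and `remove_token_filter` is a name invented for this example.

```python
import copy


def remove_token_filter(settings: dict, analyzer: str, filter_name: str) -> dict:
    """Return a copy of index settings with one token filter dropped from one analyzer."""
    updated = copy.deepcopy(settings)  # leave the caller's settings untouched
    definition = updated["analysis"]["analyzer"][analyzer]
    definition["filter"] = [
        name for name in definition.get("filter", []) if name != filter_name
    ]
    return updated


# Hypothetical settings for illustration only.
example_settings = {
    "analysis": {
        "analyzer": {
            "peliasAdmin": {
                "type": "custom",
                "tokenizer": "peliasNameTokenizer",
                "filter": ["lowercase", "word_delimiter", "trim"],
            }
        }
    }
}

cleaned = remove_token_filter(example_settings, "peliasAdmin", "word_delimiter")
print(cleaned["analysis"]["analyzer"]["peliasAdmin"]["filter"])
```

Because hyphen handling already happens in the tokenizer, the remaining filters in such a chain would be unaffected by the removal.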
After merging pelias/schema#403, it's now possible to create indices on ES 6.8.5 which will be compatible with 7.4.2.
For the adventurous among you, we have a prerelease. At a minimum, you should ensure that you've made the following configuration changes for ES7:
Elasticsearch 6 is now supported; we are working on ES7! Connects pelias/pelias#719 Connects pelias/pelias#831
In Elasticsearch 7+, the [hits count is now an object](https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-7.0.html#hits-total-now-object-search-response). This was needed because Elasticsearch now includes a performance improvement that allows non-exact hit counts to be used when the exact count isn't needed. This adds a helper to wrap around the breaking change and support either the old or new format. Extracted from #394 Connects pelias/pelias#831
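A helper of this kind can be sketched as follows. This is an illustrative Python version, not the helper added in the PR (Pelias itself is a Node.js project), and the name `get_total_hits` is hypothetical. It relies only on the two documented response shapes: an integer `hits.total` before ES7, and an object with `value` and `relation` keys from ES7 onward.

```python
def get_total_hits(response: dict) -> int:
    """Return the hit count from either the pre-ES7 or the ES7+ response format."""
    total = response.get("hits", {}).get("total", 0)
    if isinstance(total, dict):
        # ES7+: {"value": 42, "relation": "eq"} or relation "gte" for a lower bound
        return int(total.get("value", 0))
    # ES6 and earlier: a plain integer
    return int(total)


es6_response = {"hits": {"total": 42, "hits": []}}
es7_response = {"hits": {"total": {"value": 42, "relation": "eq"}, "hits": []}}

print(get_total_hits(es6_response), get_total_hits(es7_response))
```

Note that when `relation` is `gte`, the value is only a lower bound, so callers that need an exact count would have to handle that case separately.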
BREAKING CHANGE: This drops Elasticsearch 5 from the test matrix for this repo. While it makes no other changes, it's a breaking change as it marks the point where we stop supporting ES5. This allows us to make changes that support only ES6/7 as we add support for Elasticsearch 7. Connects pelias/pelias#831
Now that we have dropped support for ES5, we can change the default type name from `doc` to `_doc`. Either setting is compatible with ES6, but only `_doc` is compatible with ES7. Connects pelias/pelias#831
This drops support for testing the schema with the Elasticsearch type name set to `doc`. This was only needed to support Elasticsearch 5. In order to support Elasticsearch 7, we'll no longer be supporting ES5. Connects pelias/pelias#831
This adds Elasticsearch 7.5.1 to the CI test matrix. Until we merge complete support for ES7, this CI run is allowed to fail without failing the entire build. Connects pelias/pelias#831
In ES7, specifying a mapping type name will no longer be allowed. ES6 can emulate this behavior by setting the `include_type_name` parameter to `false` when creating and fetching mappings. This PR sets that parameter so that our mapping format is compatible with ES6, while using the ES7 preferred format. In the future, when we wish to drop support for ES6, we'll only have to stop using the `include_type_name` configuration option. Connects pelias/pelias#831
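The difference between the two mapping formats can be sketched in Python as below. This is an illustration of the request shapes only, not code from pelias/schema. The function names and the `name` field are hypothetical, while `include_type_name` is the parameter described above.

```python
def typeless_mapping_request(properties: dict) -> tuple:
    """Build (query_params, body) for the ES7-preferred typeless mapping format."""
    # On ES6 the parameter must be sent explicitly to opt in to this format.
    params = {"include_type_name": "false"}
    body = {"mappings": {"properties": properties}}
    return params, body


def typed_mapping_body(properties: dict, type_name: str = "_doc") -> dict:
    """Build the older body format, with mappings nested under a type name."""
    return {"mappings": {type_name: {"properties": properties}}}


# Hypothetical single-field mapping for illustration.
fields = {"name": {"type": "text"}}

params, body = typeless_mapping_request(fields)
print(params, body)
print(typed_mapping_body(fields))
```

Dropping ES6 support later would then mean removing only the `params` part, since the typeless body is already in the ES7 format.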
The `_all` field was deprecated in Elasticsearch 6 and completely removed in [Elasticsearch 7](https://www.elastic.co/guide/en/elasticsearch/reference/current/breaking-changes-7.0.html#all-meta-field-removed). Pelias has disabled this field for quite some time, however now that we have dropped support for ES5, we can remove this configuration option. This also moves us towards supporting ES7! Connects pelias/pelias#831
As of `pelias-config-4.8.0` we are now using the new Elasticsearch 7 compatible default document type name: `_doc`. Now that we have dropped support for ES5, we want to ensure this value is the default going forward. Connects pelias/config#122 Connects pelias/pelias#831
Now that we have merged pelias/config#122 and associated PRs to add Elasticsearch 7 support to Pelias master branches, we no longer need to override the `typeName` config option in project `pelias.json` configuration. This reverts commit acc4202. Connects pelias/pelias#831
As of today, Elasticsearch 7 support has been merged. We'll follow up with additional changes or improvements as needed, but in our testing so far, ES7 appears to perform well. We'll also soon start rolling out ES7 as the default to many of the regional projects in the Pelias Docker project.
This should support at least both ES6 and ES7 counts. Connects pelias/pelias#831
It's done :) Connects pelias/pelias#831
I think this is done now! 🎉 In our testing ES7 performs well compared to ES6. There are no changes in query behavior and performance is the same or slightly better. Please reach out to us if you find any ES7 related bugs! ES6 will continue to be supported for some time, but ES7 is now the recommended version!
@orangejulius I was trying to figure out if updating to a newer version of Pelias should imply data changes in ES and couldn't find this in any release notes / doc page. Is that a thing? And do you think maybe there should be such a page? I can add it to the docs repo via a PR (just need some tips re: its contents).
@mihneadb good point. We definitely recommend starting fresh when updating to a new version, but you can follow the general Elasticsearch compatibility rule: Elasticsearch can generally read indices created in the previous major version. In our experience, using an index created in ES5 with ES6 led to performance issues, but using an index created in ES6 with ES7 has worked fine.
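The compatibility rule mentioned above can be written down as a small check. This Python sketch encodes only the general "same or one prior major version" rule; the function name is hypothetical, and it does not capture the performance problems noted for ES5 indices on ES6.

```python
def can_open_index(index_created_major: int, running_major: int) -> bool:
    """True if an index created on one major version is readable on another.

    Encodes the general rule only: the same major version, or exactly one
    major version older than the running cluster.
    """
    return 0 <= running_major - index_created_major <= 1


for created, running in [(6, 7), (5, 7), (7, 7), (7, 6)]:
    print(f"index from ES{created} on ES{running}: {can_open_index(created, running)}")
```

Even when the check passes, rebuilding the index from a fresh import remains the recommended path when upgrading.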
This has been the case for some time. Connects pelias/pelias#831
Just in case anyone comes to this issue looking, Elasticsearch 7 is currently not only the highest supported, but also the recommended version of Elasticsearch to use. Until Elasticsearch 8 comes out and we add support for it, that will continue to be the case.
* feat(elasticsearch): Default to `_doc` as type name for ES7 support. Now that we have dropped support for ES5, we can change the default type name from `doc` to `_doc`. Either setting is compatible with ES6, but only `_doc` is compatible with ES7. Connects pelias/pelias#831
* chore(CI): Remove deprecated `matrix` section. Connects pelias/pelias#850
* feat(config): Default `whosonfirst.importPostalcodes` to true. These take very little additional space, and are quite useful. We should have enabled this a long time ago. Closes pelias#61
* fix(esclient): default esclient.apiVersion to 7.x
* feat: remove `imports.whosonfirst.importVenues`
* feat(Node.js): Drop support for Node.js 8. Node.js 8 is no longer supported as it reached [end of life](https://github.com/nodejs/Release#release-schedule) at the end of 2019. Connects pelias/pelias#837
* feat: Enable Postal Cities by default. For quite a while now we've had a solution to the "Postal Cities problem" (pelias/pelias#396), but it was disabled by default. Enough time has passed that it should probably be enabled. Closes pelias/pelias#396
* fix(get): support for lodash get defaultValue
* removed auth
* fix syntax

Co-authored-by: Julian Simioni
Co-authored-by: missinglink
Co-authored-by: Joxit
This issue will track support for Elasticsearch 7 in Pelias.
Most Elasticsearch upgrades require two sets of changes:
Pelias Tasks
Here's the list of breaking changes we'll need to adapt to (this list will be updated over time):
* `[illegal_argument_exception] Token filter [word_delimiter] cannot be used to parse synonyms` when using pelias/schema (possibly related to "Correct our use of synonyms for ES6", schema#381). Solved in "backport some features from the ES7 branch which may be ok to use now", schema#403; another solution proposed in "feat(peliasAdmin): Remove word delimiter filter", schema#392.
* `doc` mapping type in pelias-model and pelias-schema
* `_all` field. The `_all` field is always disabled in ES6+, so this code isn't needed once we are on ES6. It is incompatible with ES7. (done in "feat(es7): remove _all mapping", schema#424)

Reference links