Support query pass-through for Elasticsearch connector#12324
kasiafi merged 1 commit into trinodb:master
.../trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchConnector.java
In my opinion it would be better to verify the result. How about formatting it like this?
assertThat(
"SELECT * FROM " +
"TABLE(\"" + getSession().getCatalog().orElseThrow() + "\".\"system\".\"remote_query\"(" +
" \"schema\" => 'tpch'," +
" \"index\" => 'nation'," +
" \"query\" => '{\"query\": {\"match\": {\"name\": \"ALGERIA\"}}}'" +
"))")
.matches(...);
By the way, I expected the SELECT query above to return a result with multiple columns, but it didn't. Is that by design? Also, I would recommend adding empty-result and failure cases.
That's intentional. Pass-through queries in Elasticsearch return a single VARCHAR column as a result.
@ebyhr I added more tests including no match and failure.
plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ptf/RemoteQuery.java
martint left a comment
We should deprecate the old-style passthrough queries in favor of this.
plugin/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ptf/RemoteQuery.java
...n/trino-elasticsearch/src/main/java/io/trino/plugin/elasticsearch/ElasticsearchMetadata.java
1eb57db to 4272b0a
martint left a comment
Also, don't forget to update the docs or file an issue to do it later so we don't lose track of it.
I don't think this is such a good idea. It masks the underlying assertion, so it's hard to tell what was different and in what way. It also doesn't scale when there are more than a couple of alternatives.
Instead, modify the query to normalize the results in some way. For instance, parse the resulting json, extract the values and sort them to get a consistent result.
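The normalization suggested above could look roughly like the following. This is only a sketch: the class and method names are hypothetical, and a naive regular expression stands in for a real JSON parser so that the example stays self-contained. It is not the code that ended up in the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class ResultNormalizer {
    private ResultNormalizer() {}

    // Collects every string value stored under the given field name and sorts
    // the values, so the outcome no longer depends on the order in which
    // Elasticsearch happened to return the hits.
    // Assumption: values are plain JSON strings without escaped quotes.
    static List<String> sortedFieldValues(String json, String field) {
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(field) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(json);
        List<String> values = new ArrayList<>();
        while (matcher.find()) {
            values.add(matcher.group(1));
        }
        Collections.sort(values);
        return values;
    }
}
```

A test can then assert on the sorted list with a single expected value instead of enumerating every possible hit ordering.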
4272b0a to bb850c7
@martint I applied the comments. Please take a look.
Since the function seems to dump the raw ES result, I would consider calling it raw_query
(especially so that we keep the query name free for future use).
That's a good point. In ES parlance, queries are "search queries" (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html), and there's a Query DSL (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html) to formulate them (what we support here). They also have two other languages:
- EQL, for querying log and event data (https://www.elastic.co/guide/en/elasticsearch/reference/current/eql.html)
- SQL (https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-overview.html)
bb850c7 to 85721e4
@martint @findepi I renamed the function to raw_query. I also added a test to compare the results between the legacy pass-through mechanism and the new table function. The old query pass-through will be deprecated in the documentation when we document the table function. @martint is there any way we want to deprecate it other than a note in the docs?
You can also mark the related pieces of code as deprecated.
85721e4 to 1cd499d
This PR introduces the raw_query Polymorphic Table Function for full query pass-through to the Elasticsearch connector.
Documentation
Documentation issue is filed: #13007.
When documenting, also deprecate the legacy query pass-through mechanism: https://trino.io/docs/current/connector/elasticsearch.html?highlight=elasticsearch#pass-through-queries
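For orientation, an invocation of the table function might be assembled as in the sketch below. The "system" schema and the "schema", "index", and "query" argument names are taken from the test snippet in the review thread, which predates the rename to raw_query, so treat them as assumptions. The helper class and the catalog name used in the usage note are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class RawQueryCall {
    private RawQueryCall() {}

    // Wraps text in a SQL string literal, doubling embedded single quotes.
    static String sqlLiteral(String text) {
        return "'" + text.replace("'", "''") + "'";
    }

    // Builds a SELECT over the raw_query table function. The Query DSL
    // document is passed through as an opaque string; the result is expected
    // to be a single VARCHAR column, as stated in the review thread.
    // Assumption: the catalog name contains no double quotes.
    static String selectFromRawQuery(String catalog, String schema, String index, String queryDsl) {
        return "SELECT * FROM TABLE(\"" + catalog + "\".\"system\".\"raw_query\"(" +
                "\"schema\" => " + sqlLiteral(schema) + ", " +
                "\"index\" => " + sqlLiteral(index) + ", " +
                "\"query\" => " + sqlLiteral(queryDsl) + "))";
    }
}
```

For example, `RawQueryCall.selectFromRawQuery("elasticsearch", "tpch", "nation", "{\"query\": {\"match\": {\"name\": \"ALGERIA\"}}}")` yields a statement of the same shape as the one in the review thread, with raw_query in place of remote_query.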