ESQL: Move most esql-core project classes to esql#139248
ESQL: Move most esql-core project classes to esql#139248ivancea merged 19 commits intoelastic:mainfrom
Conversation
|
|
||
| dependencies { | ||
| api "org.antlr:antlr4-runtime:${versions.antlr4}" | ||
| api project(path: xpackModule('mapper-version')) |
There was a problem hiding this comment.
For some reason I can't move this dependency to esql, or tests like EsqlSpecIt fail with:
fatal exception while booting Elasticsearch java.lang.IllegalArgumentException: error loading whitelist(s) [org.elasticsearch.xpack.versionfield.txt]:[10]
at org.elasticsearch.painless@9.3.0-SNAPSHOT/org.elasticsearch.painless.lookup.PainlessLookupBuilder.buildFromWhitelists(PainlessLookupBuilder.java:163)
...
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch$1.<init>(Elasticsearch.java:405)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:405)
at org.elasticsearch.server@9.3.0-SNAPSHOT/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:100)
Caused by: java.lang.IllegalArgumentException: class [org.elasticsearch.xpack.versionfield.VersionScriptDocValues] cannot represent multiple java classes with the same name from different class loaders
at org.elasticsearch.painless@9.3.0-SNAPSHOT/org.elasticsearch.painless.lookup.PainlessLookupBuilder.lookupException(PainlessLookupBuilder.java:268)
...
There was a problem hiding this comment.
I also tried removing the plugin config, as this is barely a lib now, but there were different problems I couldn't solve. I'm keeping this as it is for now; can be improved later without adding too much noise; this PR has too many changed files already, and blocks other features
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Pull request overview
This PR moves most classes from the esql-core project to the esql project under the same org.elasticsearch.xpack.esql.core package to resolve circular dependencies. The esql-core project originated from the ql project and was intended to be shared code, but has become 100% specific to ESQL. This refactoring addresses artificial circular dependencies between esql and esql-core by consolidating them into a single "planning" module, making it easier to pass configuration context through functions and simplifying the overall architecture.
Key changes:
- Most classes moved from
esql-coretoesqlunder theorg.elasticsearch.xpack.esql.corepackage - Removed unused
mapping-*.jsonresources; required ones placed under test-fixtures - Updated gradle dependencies accordingly
- Left only minimal shared code (utils and exceptions) in
esql-core
Reviewed changes
Copilot reviewed 41 out of 163 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| TestUtils.java | Utility methods for creating test fixtures (Literals, FieldAttributes) |
| RemoteClusterUtilsTests.java | Tests for remote cluster string parsing utilities |
| QueriesTests.java | Tests for query combination logic |
| SourceTests.java | Tests for Source equality, hashing, and serialization |
| NodeTests.java | Tests for Node tree structure and toString rendering |
| NodeSubclassTests.java | Shim class exposing protected Node methods for testing |
| AbstractNodeTestCase.java | Base test class for Node subclasses |
| LeafQueryTests.java | Tests for leaf query equality and negation |
| StringPatternTests.java | Tests for wildcard and regex pattern matching |
| FunctionTestUtils.java | Utility for creating Literal test fixtures |
| NullabilityTests.java | Tests for expression nullability logic |
| LiteralTests.java | Tests for Literal equality, transformation, and type conversion |
| FoldContextTests.java | Tests for fold context memory tracking |
| ExpressionIdTests.java | Tests for unique NameId generation |
| AttributeSetTests.java | Tests for AttributeSet equality and immutability |
| AttributeMapTests.java | Tests for AttributeMap operations and resolution |
| StoredAsyncResponseTests.java | Serialization tests for async responses |
| AsyncTaskManagementServiceTests.java | Tests for async task lifecycle and management |
| ReflectionUtils.java | Reflection utilities for detecting generic type parameters |
| Queries.java | Utilities for combining Elasticsearch queries |
| PlanStreamOutput.java | Interface for serializing plan components with caching |
| PlanStreamInput.java | Interface for deserializing plan components with caching |
| DateUtils.java | Date/time formatting and parsing utilities |
| CollectionUtils.java | Collection manipulation utilities |
| Check.java | Runtime assertion utilities |
| UnsupportedEsField.java | Representation of unsupported field types |
| TextEsField.java | Text field type with exact match handling |
| SupportedVersion.java | Version support tracking for data types |
| StringUtils.java | String conversion utilities for various types |
| PotentiallyUnmappedKeywordEsField.java | Marker for potentially unmapped keyword fields |
| MultiTypeEsField.java | Multi-type field resolution support |
| KeywordEsField.java | Keyword field type representation |
| InvalidMappedField.java | Invalid/incompatible field mapping representation |
| FunctionEsField.java | Field with block loader function |
| EsField.java | Base field type representation |
| DateEsField.java | Date field type representation |
| DataTypeConverter.java | Type conversion utilities |
| DataType.java | Core data type enum with comprehensive type system |
| Converter.java | Conversion interface |
| Source.java | Source location tracking with serialization |
| NodeUtils.java | Node diff and string utilities |
| NodeInfo.java | Node property and transformation metadata |
| Node.java | Base immutable tree structure |
| Location.java | Line/column location representation |
| WildcardQuery.java | Wildcard pattern query |
| TermsQuery.java | Terms query for multiple values |
| TermQuery.java | Single term query |
| RegexQuery.java | Regular expression query |
| RangeQuery.java | Range query with bounds |
| QueryStringQuery.java | Full query string query with options |
| Query.java | Base query class with negation support |
| PrefixQuery.java | Prefix matching query |
| NotQuery.java | Query negation |
| MatchAll.java | Match all documents query |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
astefan
left a comment
There was a problem hiding this comment.
This looks ok with some comments:
- The tests that now use their own
Enums to consider different data types because theDataTypeclass is not visible anymore is kind of ugly and not future proof. I don't know how likely (but does it matter?) it is for future us to change something related to data types, but if we do, those tests that now use customEnums instead of the real deal will be a pain to maintain, or even catch that the future change inDataTypeis or is not reflected in those unit tests affected by this PR. I would be cautious and save ourselves a future pita issue, if there is another way to keep the unit tests the same or have them be tied to the actual DataTypes they are using. - One of the json files that you deleted -
fc-incompatible-object-compatible-subfields.jsoncame from this PR - https://github.com/elastic/elasticsearch/pull/100875/files#diff-70d0893691f1bd1e3b4be7b7146fbbbb1168851e8cdcf9ace6c26e5b6839b1a9 - where Costin fixed a major bug that was affecting multiple projects. For ESQL though, he added a test inEsqlActionITthat doesn't seem to accurately reflect the mappings from the now deleted json file. Please, double check that we test (or at least that the test that is now gone is relevant for ESQL) that specific mapping in ESQL.
|
Hey, thanks for the review @astefan!
I'm 50/50 with this. Those tests shouldn't have had DataTypes to begin with. They were used because, well, the compute classes are usually created based on the original datatype. But I look at it more like "reusing" than "needing" or even "being tied to". The other way would be to keep DataType in the core package, with some patches to make it work. But I think that would be worse? cc @nik9000 in case he has some idea about these.
I see it's only being used in a SQL test; you mean that we should add a similar test for ESQL? I'm not aware of their similarities with ESQL; I would probably add an issue to test that part with a link to the removed JSON. It could be kept, but until used, and given that it's a resource (Harder to track if unused), I would still remove it here. |
|
|
||
| dependencies { | ||
| api "org.antlr:antlr4-runtime:${versions.antlr4}" | ||
| api project(path: xpackModule('mapper-version')) |
| } | ||
|
|
||
| tasks.named('splitPackagesAudit').configure { | ||
| // TODO: Remove these exclusions by renaming the esql-core packages |
There was a problem hiding this comment.
Or removing the remaining esql-core classes.
There was a problem hiding this comment.
They're used within compute. I could either move them there (A bit meh, but would work), or rename the package (For purely technical reasons, yeah...).
Other options? Movingthem to compute would help us get rid of the package (I think; the "plugin" class isn't referenced anywhere, but I know random things fail if I remove it, like cluster tests).
In any case, I'll do it in another PR for organization
| DataType.DataTypesTransportVersions.ESQL_AGGREGATE_METRIC_DOUBLE_CREATED_VERSION, | ||
| // Same as DataType.DataTypesTransportVersions.ESQL_AGGREGATE_METRIC_DOUBLE_CREATED_VERSION, | ||
| // copied here to avoid a circular dependency | ||
| TransportVersion.fromName("esql_aggregate_metric_double_created_version"), |
There was a problem hiding this comment.
Can we/should we move this into something in compute? Maybe on the block type.
There was a problem hiding this comment.
We should be able to do so. And the fact that it's needed in compute could mean it makes sense. I'll take a look
There was a problem hiding this comment.
I just found we have 2 TVs: One on the datatype, and other (a bit older) on the block. Given this is a compute-only test, and the actual serialization implementation uses the block TV, I'm guessing it was a typo. Tested it with that fixed version just in case, and it works well.
Edit: Confirmed with Jonas Kunz that this is ok
| * Different from DataType to avoid circular dependencies. | ||
| * </p> | ||
| */ | ||
| private enum DataType { |
There was a problem hiding this comment.
I think calling this DataType might confuse things. ValueType might be better.
This was always different from esql's DataType because it had fewer distinct values. We were mostly using it so we could have a special case for the fixed-length-BytesRefs that are IPs. I think this is fine - we might add more to this, but we're not using any covering switch statements.
We could actually do that after your change - make a for (ValueType type: ValueTypes.values()) in the parameters. Though it looks like boolean is excluded right now.
There was a problem hiding this comment.
Updated it. I could also add the for; the BOOLEAN was probably forgotten; the test works well with it.
|
Let's hold merging this until you get approval from Andrei as well. |
Ok. I've checked this and we have this mapping configuration tested in |
# Context
The esql-core project comes from the ql project, which holds the shared code between eql, sql (and eventually esql). But we wanted to make greater changes, so we cloned it. esql-core is 100% for esql now.
It exists there until a better migration is made, as there are dead classes and dead code not used in esql.
Also, it's used from the esql:compute project. Basically, it also serves as the "shared" part of the projects.
# What is this doing?
1. Move most classes (Excepot for some utils and exception classes) to esql, unther the same esql.core package
2. Remove DataType usage from some compute tests
3. Move resources and gradle dependencies from esql-core to esql
- Remove unused `mapping-*.json` resources, and place the required ones under test-fixtures
The esql-core is left now with the minimum code shared with compute.
Context
The esql-core project comes from the ql project, which holds the shared code between eql, sql (and eventually esql). But we wanted to make greater changes, so we cloned it. esql-core is 100% for esql now.
It exists there until a better migration is made, as there are dead classes and dead code not used in esql.
Also, it's used from the esql:compute project. Basically, it also serves as the "shared" part of the projects.
What is this doing?
mapping-*.jsonresources, and place the required ones under test-fixturesThe esql-core is left now with the minimum code shared with compute.
Why this change, why now?
TL;DR: Circular dependencies between esql and esql-core. Dependencies that are purely artificial, as they should be the same "planning" module.
Long story, in chronological order:
Also, I think this is a nice step towards making it easy to "cleanup" these classes. Or at least, it's not negative.
To be done after merging
splitPackagesAuditreports as an error having packages with classes in multiple projects. I added exclusions for the current ones, as to avoid changing many files. This will be removed in another PR, and the package renamed to "core.shared" or similar