optimize source filtering in SourceFieldMapper#81970
optimize source filtering in SourceFieldMapper#81970romseygeek merged 13 commits intoelastic:masterfrom
Conversation
Rally Result of MapFilterRally Result of ParserFilter |
|
Pinging @elastic/es-search (Team:Search) |
romseygeek
left a comment
There was a problem hiding this comment.
This looks great @mushao999, thanks for opening. I'd like @nik9000 to have a look as well before we merge it but LGTM in general. I left a couple of suggestions.
server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java
Outdated
Show resolved
Hide resolved
Ah! @romseygeek this is the one I'd hope you'd review. And you already did. Cool. I'll take a look too. |
|
I'm a little surprised there isn't less GC. I mean, there is a little less GC, but I'd hoped for better. I'll bet the GC from the rest of the indexing process is just swamping the gains. |
server/src/main/java/org/elasticsearch/index/mapper/SourceFieldMapper.java
Outdated
Show resolved
Hide resolved
|
I wonder if it's worth moving your change into |
|
@elasticmachine test this please |
Applying the new filtering mechanism throughout all the filtering cases is a good idea. But I do not think |
This is a good point! Can we instead create a new interface |
Thanks, this is a great suggestion. I'v added two commits.
By the way, I think |
romseygeek
left a comment
There was a problem hiding this comment.
I left a couple of comments but this looks good overall. Thanks for being patient with all of our suggestions!
server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java
Outdated
Show resolved
Hide resolved
|
@elasticmachine test this please |
|
The build is failing on checkstyle - can you also run |
|
@romseygeek Thanks for you suggestion. The |
server/src/main/java/org/elasticsearch/common/xcontent/XContentHelper.java
Show resolved
Hide resolved
|
@elasticmachine test this please |
|
@romseygeek There are still two test failed , I will try to fix them tomorrow. Thanks again. |
|
@romseygeek Sorry for the test failure. Tests failed due to null value of xContentType in |
|
@elasticmachine test this please |
romseygeek
left a comment
There was a problem hiding this comment.
LGTM, thanks for your patience on this
This reverts commit 4560a0c. It unexpectedly caused elastic#82891.
This commit introduces XContentFieldFilter, which applies field includes/excludes to XContent without having to realise the xcontent itself as a java map. SourceFieldMapper and ShardGetService are cut over to use this class
@nik9000 and @weizijun did a great job in introducing a memory efficient xcontent filtering approach and benchmarked its peformance (#77154 #81575). From their contribution, source filtering will be faster significantly.
This PR apply this approach into SourceFilterMapper where map filtering is used to include or exclude fields.
We did a benchmark using esrally, and the details list as follow:
--track=nyc_taxis --challenge=append-no-conflicts-index-only --client-options="timeout:60,basic_auth_user:'elastic',basic_auth_password:'Admin@gal'" --track-params='{"bulk_size": 1000, "index_settings": {"index.number_of_replicas": "1", "index.number_of_shards": "6"}, "bulk_indexing_clients": 10}'"exludes": ["surcharge", "pickup_location"]So we believe indexing operation with source filtering will be more memory efficient after this PR being applied.