Skip to content

Conversation

@weizijun
Copy link
Contributor

@weizijun weizijun commented Jan 31, 2022

About TSDB index.routing_path setting improvement in #82511
The Point 1 and 3 is doing improvement by Nik.
This PR is about the point 2 improvement.

ignore object type check:

  • It will cause that some mistake. e.g: {"foo":{"bar":"xxx"}} , index.routing_path : "foo.*" is ok, but index.routing_path : "foo*" is not ok.
  • I think we can ignore object type, because the routing_path will check the field if it has time_series_dimension parameter, as object can't set the parameter, it can ignore the object check.

As Nik strongly recommends rejecting the matching object in the routing generation code. I do a little improvement.
Object type is rejected when the routing_path patten is equals the object name. e.g:
routingPath=foo* is ok, but routingPath=foo is not ok.

@elasticsearchmachine elasticsearchmachine added v8.1.0 external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jan 31, 2022
@fcofdez fcofdez added :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jan 31, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@nik9000 nik9000 self-requested a review January 31, 2022 23:05
@nik9000
Copy link
Member

nik9000 commented Feb 1, 2022

@elasticmachine, test this please.

I'll be this is going to cause "fun" with #83148.

@nik9000 nik9000 self-assigned this Feb 1, 2022
@nik9000 nik9000 requested a review from imotov February 1, 2022 16:03
@nik9000
Copy link
Member

nik9000 commented Feb 2, 2022

@elasticmachine, update branch

@nik9000
Copy link
Member

nik9000 commented Feb 2, 2022

@elasticmachine update branch

@mark-vieira mark-vieira added v8.2.0 and removed v8.1.0 labels Feb 2, 2022
@nik9000
Copy link
Member

nik9000 commented Feb 2, 2022

@elasticmachine, test this please

@weizijun weizijun changed the title TSDB: routingPath object check improve TSDB: routingPath object type check improve Feb 8, 2022
@weizijun weizijun changed the title TSDB: routingPath object type check improve TSDB: routingPath object type check improvement Feb 8, 2022
Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't quite understood all of the implications here. Right now the checks in DocumentMapper match more fields than the IndexRouting code matches which feels safe. It might reject mappings that are technically valid, I mean. But with this change I'm not 100% sure it does that. It probably does. I just haven't quite figured it out yet.

@imotov
Copy link
Contributor

imotov commented Feb 9, 2022

@weizijun, I am not quite sure I understand why you singled out * at the end. We have the same issues with the * in the middle, or beginning:

DELETE test
PUT test
{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": "di*m"
  },
  "mappings": {
    "properties": {
      "dim": {
        "properties": {
          "dim": {
            "type": "keyword",
            "time_series_dimension": true
          }
        }
      },
      "val": {
        "type": "double"
      }
    }
  }
}

@weizijun
Copy link
Contributor Author

@imotov @nik9000 , the PR is to solve the problem that object type will throw exception. e.g
the data is :

{
  "dim": {
    "bar": "xxx"
  }
}

Now, the index.routingPath configure with dim.* is ok, but dim*, di*, d* will throw exception.
I add a condition, that if the patten end with *, it means the children of object can match the patten. So we can pass it, don't throw exception.
Only the patten exact match the object type will throw exception, like *im, d*m, *m, it exact match the object type, not ok, and throw the exception.

dim*, di*, d* will pass the check, and don't throw exception.

@weizijun
Copy link
Contributor Author

weizijun commented Feb 10, 2022

DELETE test
PUT test
{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": "di*m"
  },
  "mappings": {
    "properties": {
      "dim": {
        "properties": {
          "dim": {
            "type": "keyword",
            "time_series_dimension": true
          }
        }
      },
      "val": {
        "type": "double"
      }
    }
  }
}

In the case, it will throw exception, and index.routing_path=dim*, di*, d* will also throw exception.

After the PR, index.routing_path=dim*, di*, d* is ok. but index.routing_path=di*m, *im, d*m, *m will also throw exception.

@imotov
Copy link
Contributor

imotov commented Feb 10, 2022

After the PR, index.routing_path=dim*, di*, d* is ok. but index.routing_path=di*m, *im, d*m, *m will also throw exception.

Why do you think it is ok to throw exception in this case and not throw an exception in the other? To me these are the same use case. If we want to solve one of them - let's solve all of them. I don't really see the difference between leading and trailing *. If you think your use case with trailing * is important to fix, there are a lot of also reasonable use-cases to support leading *. I am pretty sure there are some use cases that might make * in the middle useful as well.

@weizijun
Copy link
Contributor Author

@elasticmachine update branch

@imotov
Copy link
Contributor

imotov commented Feb 10, 2022

@elasticmachine test this please

@imotov
Copy link
Contributor

imotov commented Feb 10, 2022

@elasticmachine, test this please

Copy link
Contributor

@imotov imotov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it looks reasonable now. @nik9000 WDYT?

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

@elasticmachine test this please

@nik9000
Copy link
Member

nik9000 commented Feb 11, 2022

Ah! It already has.

@nik9000 nik9000 merged commit ab6de24 into elastic:master Feb 11, 2022
@nik9000
Copy link
Member

nik9000 commented Feb 11, 2022

Merged. Thanks @weizijun !

@weizijun
Copy link
Contributor Author

Merged. Thanks @weizijun !

Thanks @imotov @nik9000

@weizijun weizijun deleted the routingPath-object-check-improve branch February 11, 2022 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants