Skip to content

[DONTREVIEW] Add test for unused subset removal#3802

Closed
rgs1 wants to merge 4 commits intoenvoyproxy:masterfrom
rgs1:subsets-are-not-removed
Closed

[DONTREVIEW] Add test for unused subset removal#3802
rgs1 wants to merge 4 commits intoenvoyproxy:masterfrom
rgs1:subsets-are-not-removed

Conversation

@rgs1
Copy link
Member

@rgs1 rgs1 commented Jul 6, 2018

We are tracking down why our active subsets keep growing without
the unused ones being removed, e.g.:

$ curl -s localhost:9901/stats | grep subset | grep app
cluster.app.lb_subsets_active: 396
cluster.app.lb_subsets_created: 396
cluster.app.lb_subsets_fallback: 1171992
cluster.app.lb_subsets_removed: 0
cluster.app.lb_subsets_selected: 46201647

We have the following subset LB config:

    "lb_subset_config": {
      "fallback_policy": "DEFAULT_SUBSET",
      "default_subset": {
        "default": "true"
      },
      "subset_selectors": [
        {
          "keys": [
            "default"
          ]
        },
        {
          "keys": [
            "version"
          ]
        },
        {
          "keys": [
            "canary"
          ]
        }
      ]
    }

And we never have more than have more than 6 possible subsets at a given
time:

default/true
version/v1
version/v2
version/v3
version/v4
canary/true

Of course, versions keep changing but there are never more than 4 at the same
time. It looks like the old ones are not being removed.

One important note is that endpoints remain constant (e.g.: (ip, port)), but
their metadata changes (e.g.: the version).

Signed-off-by: Raul Gutierrez Segales rgs@pinterest.com

We are tracking down why our active subsets keep growing without
the unused ones being removed, e.g.:

```
$ curl -s localhost:9901/stats | grep subset | grep app
cluster.app.lb_subsets_active: 396
cluster.app.lb_subsets_created: 396
cluster.app.lb_subsets_fallback: 1171992
cluster.app.lb_subsets_removed: 0
cluster.app.lb_subsets_selected: 46201647
```

We have the following subset LB config:

```
    "lb_subset_config": {
      "fallback_policy": "DEFAULT_SUBSET",
      "default_subset": {
        "default": "true"
      },
      "subset_selectors": [
        {
          "keys": [
            "default"
          ]
        },
        {
          "keys": [
            "version"
          ]
        },
        {
          "keys": [
            "canary"
          ]
        }
      ]
    }
```

And we never have more than have more than 6 possible subsets at a given
time:

```
default/true
version/v1
version/v2
version/v3
version/v4
canary/true
```

Of course, versions keep changing but there are never more than 4 at the same
time. It looks like the old ones are not being removed.

One important note is that endpoints remain constant (e.g.: (ip, port)), but
their metadata changes (e.g.: the version).

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
This was referenced Jul 6, 2018
@rgs1
Copy link
Member Author

rgs1 commented Jul 6, 2018

Here's how the test fail (while I fix format issues, etc), TLDR: active subsets keep
growing and no subset is removed.

Test output:

test/common/upstream/subset_lb_test.cc:578: Failure
Expected equality of these values:
  2U
    Which is: 2
  stats_.lb_subsets_active_.value()
    Which is: 3
test/common/upstream/subset_lb_test.cc:580: Failure
Expected equality of these values:
  1U
    Which is: 1
  stats_.lb_subsets_removed_.value()
    Which is: 0
[  FAILED  ] UpdateOrderings/SubsetLoadBalancerTest.UpdateMetadata/0, where GetParam() = 0 (4 ms)
[ RUN      ] UpdateOrderings/SubsetLoadBalancerTest.UpdateMetadata/1
test/common/upstream/subset_lb_test.cc:578: Failure
Expected equality of these values:
  2U
    Which is: 2
  stats_.lb_subsets_active_.value()
    Which is: 3
test/common/upstream/subset_lb_test.cc:580: Failure
Expected equality of these values:
  1U
    Which is: 1
  stats_.lb_subsets_removed_.value()
    Which is: 0
[  FAILED  ] UpdateOrderings/SubsetLoadBalancerTest.UpdateMetadata/1, where GetParam() = 1 (3 ms)

Raul Gutierrez Segales added 2 commits July 6, 2018 15:29
Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
My previous test was incorrect because the Host objects
are reused.

So I don't think is actually a valid repro for the problem
I am chasing.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@rgs1 rgs1 changed the title Add test for unused subset removal [DONTREVIEW] Add test for unused subset removal Jul 6, 2018
@mattklein123
Copy link
Member

Assigning over to @zuercher to help figure out what's going on here.

It doesn't — unless the endpoint is first removed and then
added again. Not sure if this is by design or an omission.

Signed-off-by: Raul Gutierrez Segales <rgs@pinterest.com>
@zuercher
Copy link
Member

zuercher commented Jul 9, 2018

Closed in favor of #3814

@zuercher zuercher closed this Jul 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants