sorted cardinality results don't include the largest bucket

**Elasticsearch version** (`bin/elasticsearch --version`): 8.0.0, 7.12.0, 7.11.0

**Plugins installed**: [] none, default distribution

**JVM version** (`java -version`): built-in JDK

**OS version** (`uname -a` if on a Unix-like system): all (the output below is from my current master build, but this affects the 7.x and 7.11 branches as well)
```
{
  "name" : "LEEDR-XPS",
  "cluster_name" : "es-test-cluster",
  "cluster_uuid" : "s-GANDlNSZ2nNdr00SQw3g",
  "version" : {
    "number" : "8.0.0-SNAPSHOT",
    "build_flavor" : "oss",
    "build_type" : "zip",
    "build_hash" : "3454a094f73e7696446dbd2c0525041293dd4460",
    "build_date" : "2021-01-19T19:31:16.897887417Z",
    "build_snapshot" : true,
    "lucene_version" : "8.8.0",
    "minimum_wire_compatibility_version" : "7.12.0",
    "minimum_index_compatibility_version" : "7.0.0"
  },
  "tagline" : "You Know, for Search"
}
```

**Description of the problem including expected versus actual behavior**: A `cardinality` agg nested under a `terms` agg, with the terms ordered by that cardinality, no longer returns the term with the largest cardinality. Which buckets come back depends on the terms agg `size`.

Almost 3 years ago we automated the Shakespeare Kibana getting started tutorial: https://www.elastic.co/guide/en/kibana/6.8/tutorial-load-dataset.html
The test passed with the same expected results until about Oct 29, 2020, when the results returned by the aggregation changed. Unfortunately, the test was skipped so that Kibana could take the new Elasticsearch snapshot, and the failure wasn't investigated until now.

**Steps to reproduce**:

 1. Download this data: https://download.elastic.co/demos/kibana/gettingstarted/shakespeare_6.0.json
 2. Create this mapping:
 ```
PUT /shakespeare
{
  "mappings": {
    "properties": {
      "speaker": {
        "type": "keyword"
      },
      "play_name": {
        "type": "keyword"
      },
      "line_id": {
        "type": "integer"
      },
      "speech_number": {
        "type": "integer"
      }
    }
  }
}
```

 3. Load the data:
 `curl -H 'Content-Type: application/x-ndjson' -XPOST 'localhost:9200/shakespeare/doc/_bulk?pretty' --data-binary @shakespeare_6.0.json`
 4. Count the docs to make sure we have the same data: `curl -XGET 'localhost:9200/shakespeare/_count'` should return `"count":111396`.
 5. Run the same query as the Kibana visualization test:
```
GET /shakespeare/_search
{
  "aggs": {
    "2": {
      "terms": {
        "field": "play_name",
        "order": {
          "1": "desc"
        },
        "size": 5
      },
      "aggs": {
        "1": {
          "cardinality": {
            "field": "speaker"
          }
        }
      }
    }
  },
  "size": 0,
  "fields": [],
  "script_fields": {},
  "stored_fields": [
    "*"
  ],
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}
```
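
As a cross-check that doesn't involve Elasticsearch at all, the expected buckets can be computed directly from the bulk file. The following is a minimal sketch, assuming each document line in `shakespeare_6.0.json` is a JSON object with `play_name` and `speaker` fields and that the bulk action lines have neither. The function name `top_plays_by_speakers` is made up for illustration. It counts exact distinct values, whereas the `cardinality` agg is approximate (HyperLogLog++), so the numbers are only expected to agree closely at these small per-play counts.

```python
import json
from collections import defaultdict


def top_plays_by_speakers(lines, size=5):
    """Return the `size` plays with the most distinct speakers, largest first."""
    speakers = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        doc = json.loads(line)
        # Bulk action lines ({"index": {...}}) carry no play_name; skip them.
        if "play_name" not in doc or "speaker" not in doc:
            continue
        # An empty speaker string is counted as a value here; that this
        # matches what Elasticsearch does for keyword fields is an assumption.
        speakers[doc["play_name"]].add(doc["speaker"])
    counts = [(play, len(names)) for play, names in speakers.items()]
    # Sort by distinct-speaker count descending, then by name for stable output.
    counts.sort(key=lambda item: (-item[1], item[0]))
    return counts[:size]
```

To run it against the downloaded file, open `shakespeare_6.0.json` and pass the file object as `lines`. If the top entry differs from the first bucket Elasticsearch returns for the same `size`, that points at the aggregation rather than the data.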

The results I get on latest master are incorrect:
```
{
  "took" : 12,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "2" : {
      "doc_count_error_upper_bound" : -1,
      "sum_other_doc_count" : 94454,
      "buckets" : [
        {
          "key" : "Henry VI Part 2",
          "doc_count" : 3334,
          "1" : {
            "value" : 65
          }
        },
        {
          "key" : "Coriolanus",
          "doc_count" : 3992,
          "1" : {
            "value" : 62
          }
        },
        {
          "key" : "Antony and Cleopatra",
          "doc_count" : 3862,
          "1" : {
            "value" : 55
          }
        },
        {
          "key" : "Henry VI Part 1",
          "doc_count" : 2983,
          "1" : {
            "value" : 53
          }
        },
        {
          "key" : "Julius Caesar",
          "doc_count" : 2771,
          "1" : {
            "value" : 51
          }
        }
      ]
    }
  }
}
```
If we increase the terms agg size to 12, the results include the largest bucket, with a value of 71. That is what the Kibana test has expected since it was written almost 3 years ago, and it is what 7.10 returns:
```
      "buckets" : [
        {
          "key" : "Richard III",
          "doc_count" : 3941,
          "1" : {
            "value" : 71
          }
        },
```
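
This observation could be turned into an automated check. With a `terms` agg ordered descending by a sub-aggregation on a single shard, no bucket in the larger-`size` response should outrank a bucket in the smaller-`size` response unless it also appears there. The sketch below uses only the values quoted in this issue; the helper name `buckets_missing_from_small` is hypothetical, and the sub-agg name `"1"` comes from the query above.

```python
def buckets_missing_from_small(small, large, metric="1"):
    """Return keys from `large` that outrank the weakest bucket in `small`
    but are absent from it. An empty list means the two are consistent."""
    if not small:
        return [b["key"] for b in large]
    small_keys = {b["key"] for b in small}
    floor = min(b[metric]["value"] for b in small)
    return [
        b["key"]
        for b in large
        if b["key"] not in small_keys and b[metric]["value"] > floor
    ]


# Buckets from the size=5 response in this issue.
size_5 = [
    {"key": "Henry VI Part 2", "doc_count": 3334, "1": {"value": 65}},
    {"key": "Coriolanus", "doc_count": 3992, "1": {"value": 62}},
    {"key": "Antony and Cleopatra", "doc_count": 3862, "1": {"value": 55}},
    {"key": "Henry VI Part 1", "doc_count": 2983, "1": {"value": 53}},
    {"key": "Julius Caesar", "doc_count": 2771, "1": {"value": 51}},
]
# Only the first bucket of the size=12 response is quoted above.
size_12_head = [
    {"key": "Richard III", "doc_count": 3941, "1": {"value": 71}},
]

print(buckets_missing_from_small(size_5, size_12_head))  # ['Richard III']
```

A non-empty result, as here, means the smaller response dropped a bucket that should have ranked inside it.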


**Provide logs (if relevant)**:



