Add docs.total_size_in_bytes to the Index Stats API #97670

@joegallo

Description

DocsStats grew the totalSizeInBytes quite some time ago in #27117. We use it in the rollover API for both the max_size and the max_primary_shard_size conditions.

The Rollover API docs claim that the size used by these conditions is the same value as store.size from _cat/indices and store from _cat/shards, but that's not actually the case. It's slightly different (I don't know the precise details around the nature of the difference, only that it is different).

Specifically:

GET _cat/indices?s=index&bytes=b&v
health status index    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   tweets-1 ylWj3PwUT6qAFVo3oA8ofA   1   1          5            0      27730          27730
GET _cat/shards?s=index&bytes=b&v
index    shard prirep state      docs store ip        node
tweets-1 0     p      STARTED       5 27730 127.0.0.1 runTask-0
tweets-1 0     r      UNASSIGNED                      

But in my local copy of Elasticsearch, I've added the DocsStats total_size_in_bytes to the JSON output (see diff below) and now I see the following:

GET tweets-1/_stats/docs,store
{
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_all" : {...},
  "indices" : {
    "tweets-1" : {
      "uuid" : "ylWj3PwUT6qAFVo3oA8ofA",
      "health" : "yellow",
      "status" : "open",
      "primaries" : {
        "docs" : {
          "count" : 5,
          "deleted" : 0,
          "total_size_in_bytes" : 27071
        },
        "store" : {
          "size_in_bytes" : 27730,
          "total_data_set_size_in_bytes" : 27730,
          "reserved_in_bytes" : 0
        }
      },
      "total" : {...}
    }
  }
}

Indeed, 27730 and 27071 are quite close to each other, but they aren't precisely the same (emphasizing a second time, I don't know the upper or lower bounds on how much those numbers could differ or why they differ at all).
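To put a number on "quite close", here is a minimal sketch that computes the gap between the two values from the response above. The class and method names are made up for illustration; nothing here is Elasticsearch code, and a single 5-document index says nothing about how the gap behaves in general.

```java
public class SizeGap {
    // Bytes by which store.size_in_bytes exceeds docs.total_size_in_bytes.
    static long gapBytes(long storeSizeInBytes, long docsTotalSizeInBytes) {
        return storeSizeInBytes - docsTotalSizeInBytes;
    }

    // The same gap expressed as a percentage of the store size.
    static double gapPercent(long storeSizeInBytes, long docsTotalSizeInBytes) {
        return 100.0 * gapBytes(storeSizeInBytes, docsTotalSizeInBytes) / storeSizeInBytes;
    }

    public static void main(String[] args) {
        long store = 27730L; // primaries.store.size_in_bytes from the response above
        long docs = 27071L;  // primaries.docs.total_size_in_bytes from the response above
        System.out.println("gap in bytes: " + gapBytes(store, docs));
        System.out.println("gap as % of store size: " + gapPercent(store, docs));
    }
}
```

For this index the gap is 659 bytes, a bit under 2.4% of the store size.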

It seems to me that it was probably an oversight that we don't report total_size_in_bytes for the DocsStats. Especially since we're using it for rollover decisions, I think we should expose that value in a non-cat API.
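A small sketch of why the discrepancy matters for rollover: with a size threshold that falls between the two numbers, someone watching store.size would expect the condition to be met, while a check based on the DocsStats value would not be. The threshold of 27500 bytes, the class name, and the `>=` comparison are all assumptions made for illustration; this is not the actual rollover condition code.

```java
public class RolloverSizeCheck {
    // Hypothetical stand-in for a max_size / max_primary_shard_size style check:
    // the condition counts as met once the observed size reaches the threshold.
    static boolean conditionMet(long observedSizeInBytes, long thresholdInBytes) {
        return observedSizeInBytes >= thresholdInBytes;
    }

    public static void main(String[] args) {
        long threshold = 27500L;     // hypothetical threshold, chosen to fall between the two values
        long storeSize = 27730L;     // store.size as reported by _cat/indices
        long docsTotalSize = 27071L; // docs.total_size_in_bytes from the patched stats output

        System.out.println("met according to store.size: " + conditionMet(storeSize, threshold));
        System.out.println("met according to docs size:  " + conditionMet(docsTotalSize, threshold));
    }
}
```

Under these assumptions the first check is true and the second is false, which is the kind of surprise that reporting docs.total_size_in_bytes in the stats output would make easy to diagnose.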


joegallo@galactic:~/Code/elastic/elasticsearch $ git diff
diff --git a/server/src/main/java/org/elasticsearch/index/shard/DocsStats.java b/server/src/main/java/org/elasticsearch/index/shard/DocsStats.java
index 1b4b6405df7..d99361f7e1b 100644
--- a/server/src/main/java/org/elasticsearch/index/shard/DocsStats.java
+++ b/server/src/main/java/org/elasticsearch/index/shard/DocsStats.java
@@ -81,6 +81,7 @@ public class DocsStats implements Writeable, ToXContentFragment {
         builder.startObject(Fields.DOCS);
         builder.field(Fields.COUNT, count);
         builder.field(Fields.DELETED, deleted);
+        builder.field(Fields.TOTAL_SIZE_IN_BYTES, totalSizeInBytes);
         builder.endObject();
         return builder;
     }
@@ -102,5 +103,6 @@ public class DocsStats implements Writeable, ToXContentFragment {
         static final String DOCS = "docs";
         static final String COUNT = "count";
         static final String DELETED = "deleted";
+        static final String TOTAL_SIZE_IN_BYTES = "total_size_in_bytes";
     }
 }
