@@ -27,6 +27,12 @@ private StandardBlobTypes() {}
*/
public static final String APACHE_DATASKETCHES_THETA_V1 = "apache-datasketches-theta-v1";

/**
* A serialized form of a KLL sketch, a very compact quantiles sketch, produced by the <a
* href="https://datasketches.apache.org/">Apache DataSketches</a> library.
*/
public static final String APACHE_DATASKETCHES_KLL_SKETCH = "apache-datasketches-kll-v1";

/** A serialized deletion vector according to the Iceberg spec */
public static final String DV_V1 = "deletion-vector-v1";
}
9 changes: 9 additions & 0 deletions format/puffin-spec.md
@@ -181,6 +181,15 @@ for Puffin v1.
[roaring-bitmap-portable-serialization]: https://github.com/RoaringBitmap/RoaringFormatSpec?tab=readme-ov-file#extension-for-64-bit-implementations
[roaring-bitmap-general-layout]: https://github.com/RoaringBitmap/RoaringFormatSpec?tab=readme-ov-file#general-layout

#### `apache-datasketches-kll-v1` blob type

A serialized form of a KLL sketch, a very compact quantiles sketch, produced by the
[Apache DataSketches](https://datasketches.apache.org/) library.
A KLL quantiles sketch is a mergeable streaming summary that estimates
the distribution of values. It approximately answers queries about the rank of a value,
the probability mass function (PMF) or histogram of the distribution,
the cumulative distribution function (CDF), and quantiles (such as the median, min, max, and 95th percentile).
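
To make the query types listed above concrete, the following is a minimal sketch of their exact (non-approximate) counterparts over a small in-memory array. It is not the DataSketches API and does not produce or read this blob type; the class name `ExactQuantileQueries` and its methods are hypothetical, and the inclusive-rank and split-point conventions used here are assumptions chosen for illustration. A KLL sketch would return approximations of these answers while keeping only a bounded summary of the stream.

```java
import java.util.Arrays;

/**
 * Exact reference versions of the queries that a quantiles sketch answers
 * approximately. Hypothetical illustration only; not part of DataSketches.
 */
final class ExactQuantileQueries {
  private final double[] sorted;

  ExactQuantileQueries(double[] values) {
    if (values.length == 0) {
      throw new IllegalArgumentException("values must not be empty");
    }
    this.sorted = values.clone();
    Arrays.sort(this.sorted);
  }

  /** Normalized inclusive rank: the fraction of values less than or equal to v. */
  double rank(double v) {
    int count = 0;
    for (double x : sorted) {
      if (x <= v) {
        count++;
      }
    }
    return (double) count / sorted.length;
  }

  /** Smallest stored value whose inclusive rank is at least r, for r in [0, 1]. */
  double quantile(double r) {
    if (r < 0.0 || r > 1.0) {
      throw new IllegalArgumentException("rank must be in [0, 1]");
    }
    int index = (int) Math.ceil(r * sorted.length) - 1;
    return sorted[Math.max(index, 0)]; // r = 0 maps to the minimum
  }

  /** CDF at each split point, plus a final entry of 1.0 for the whole range. */
  double[] cdf(double[] splitPoints) {
    double[] out = new double[splitPoints.length + 1];
    for (int i = 0; i < splitPoints.length; i++) {
      out[i] = rank(splitPoints[i]);
    }
    out[splitPoints.length] = 1.0;
    return out;
  }

  /** PMF (histogram): the mass in each bucket delimited by the split points. */
  double[] pmf(double[] splitPoints) {
    double[] cumulative = cdf(splitPoints);
    double[] out = new double[cumulative.length];
    double previous = 0.0;
    for (int i = 0; i < cumulative.length; i++) {
      out[i] = cumulative[i] - previous;
      previous = cumulative[i];
    }
    return out;
  }
}
```

For the values 1 through 10, `quantile(0.5)` is the median, `quantile(0.0)` and `quantile(1.0)` are the min and max, and `pmf(new double[] {5.0})` splits the mass into the buckets at or below 5 and above 5.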

### Compression codecs

The data can also be uncompressed. If it is compressed the codec should be one of