This repository was archived by the owner on Feb 6, 2024. It is now read-only.
Closed
18 changes: 18 additions & 0 deletions landing-page/content/common/puffin-spec.md
@@ -126,6 +126,24 @@ The blob metadata for this blob may include following properties:

- `ndv`: estimate of number of distinct values, derived from the sketch.

#### `column-statistics-obj` blob type

A serialized form of a Hive `ColumnStatisticsObj`.

It records the column name and type together with statistics such as histograms, NDV, minimum and maximum values, number of nulls, and number of trues.
The full list of supported statistics is given in the Hive documentation:
[ColumnStatistics](https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ColumnStatistics)

#### `apache-datasketches-KLL-sketch` blob type

A serialized form of a "compact" KLL sketch produced by the [Apache
DataSketches](https://datasketches.apache.org/) library.
The KLL sketch is a very compact quantiles sketch with a lazy compaction
scheme and nearly optimal accuracy per bit.

Histograms are derived from this sketch.
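
As a rough illustration of how a histogram can be derived from a quantiles summary, the sketch below computes bucket fractions from CDF values at chosen split points. It is not the DataSketches API: an exact sorted list stands in for the KLL sketch (an assumption made so the example needs only the standard library), and `histogram_from_cdf` is a hypothetical helper name. A real KLL sketch would answer the same rank queries approximately.

```python
from bisect import bisect_left


def histogram_from_cdf(sorted_values, split_points):
    """Return the fraction of values falling in each bucket.

    Buckets are (-inf, s0), [s0, s1), ..., [s_last, +inf) for ascending
    split_points. The sorted list is a stand-in for a quantiles sketch.
    """
    n = len(sorted_values)
    if n == 0:
        return [0.0] * (len(split_points) + 1)
    # CDF at each split point: fraction of values strictly below it.
    cdf = [bisect_left(sorted_values, s) / n for s in split_points]
    # Differences between consecutive CDF values give the bucket masses.
    edges = [0.0] + cdf + [1.0]
    return [hi - lo for lo, hi in zip(edges, edges[1:])]


values = sorted([1, 2, 2, 3, 5, 8, 13, 21])
fractions = histogram_from_cdf(values, [3, 10])
print(fractions)
```

Multiplying each fraction by the total row count gives estimated bucket counts.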


### Compression codecs

The data can also be uncompressed. If it is compressed the codec should be one of