
Add SFM Sketch for private cardinality estimation #65

Merged
NikhilCollooru merged 1 commit into prestodb:master from jonhehir:sfm-sketch on Oct 21, 2023


@jonhehir

The SFM sketch is a mergeable, differentially private sketch for distinct counting, introduced in https://arxiv.org/abs/2302.02056. Conceptually, it is similar to HyperLogLog, except that noise may be intentionally added to the sketch to preserve the privacy of its entries. After noise has been added, the sketch becomes private and immutable. However, private sketches may be merged with other private sketches to obtain a sketch of their union.

This sketch may be used to obtain private estimates of distinct counts with low memory footprint in a distributed setting.
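
To make the idea concrete, here is a toy Python illustration of the general shape described above: a bit-matrix sketch in the Flajolet-Martin style, an exact merge of non-private sketches, and a randomized-response step that flips each bit independently to produce a noisy copy. This is only a sketch under stated assumptions and not the code in this PR: the names (`new_sketch`, `add`, `merge_exact`, `make_private`), the parameters `BUCKETS` and `PRECISION`, the hash function, and the flip probability `1 / (1 + e^epsilon)` are all chosen for illustration. Merging two already-noisy sketches and estimating the cardinality both need extra care and are deliberately left out.

```python
import hashlib
import math
import random

BUCKETS = 64     # rows in the bit matrix (hypothetical parameter)
PRECISION = 24   # bits per row (hypothetical parameter)


def _hash64(item: str) -> int:
    """Stable 64-bit hash of a string (illustrative choice of hash)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")


def new_sketch():
    """Return an empty BUCKETS x PRECISION bit matrix."""
    return [[0] * PRECISION for _ in range(BUCKETS)]


def add(sketch, item: str) -> None:
    """Set one bit: the row comes from the low hash bits, the column from
    the number of trailing zeros in the remaining bits (capped)."""
    h = _hash64(item)
    bucket = h % BUCKETS
    rest = h // BUCKETS
    level = 0
    while level < PRECISION - 1 and (rest >> level) & 1 == 0:
        level += 1
    sketch[bucket][level] = 1


def merge_exact(a, b):
    """Merge two NON-private sketches: bitwise OR gives the union's sketch."""
    return [[x | y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def make_private(sketch, epsilon: float, rng: random.Random):
    """Return a noisy copy in which every bit is flipped independently with
    probability 1 / (1 + e^epsilon) (assumed randomized-response rate)."""
    flip = 1.0 / (1.0 + math.exp(epsilon))
    return [[bit ^ (rng.random() < flip) for bit in row] for row in sketch]
```

In this toy version the noisy copy is returned as a new matrix and the input is left untouched, which mirrors the "private and immutable" behaviour described above: once noise is added, no further items should be inserted into the noisy sketch.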

Some minor refactoring of the PrivateLpcaSketch is included in this commit. (The PrivateLpcaSketch can be used to convert an existing HyperLogLog to a different, less flexible privacy-preserving sketch. While similar in spirit, the use cases for these two sketches are subtly different.)

@jonhehir jonhehir requested a review from mlyublena October 11, 2023 22:20
@jonhehir
Author

cc @DanielTing @gcormode


@mlyublena mlyublena left a comment


LGTM from Presto's side

@gcormode

Looks good to me too

@jonhehir jonhehir requested review from a team and presto-oss and removed request for a team October 19, 2023 18:58
@NikhilCollooru NikhilCollooru merged commit 277184d into prestodb:master Oct 21, 2023

4 participants