
PQ could benefit from event-level compression #17819

@yaauie

Description


As write throughput is bound by disk IO, compressing events during serialization could improve throughput at the cost of CPU (see: proof of concept).

If possible, per-event compression should be delivered within the scope of the existing v2 PQ page format, in which entries contain only seqnum+length+N bytes. To do this, the reader will need to handle compressed or uncompressed bytes without additional context (e.g., by differentiating a zlib header from the existing CBOR first-bytes).
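
For illustration, a minimal sketch of what such context-free handling might look like, assuming zlib framing on write and header-sniffing on read. The `EntryCodec` class name and the exact detection rule are illustrative assumptions, not the proof-of-concept's implementation:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/**
 * Sketch of transparent per-entry compression inside the existing
 * seqnum+length+bytes entry layout. Assumptions (not the actual PQ code):
 * the writer wraps the serialized CBOR bytes in a zlib stream, and the reader
 * tells compressed and legacy entries apart by the zlib CMF/FLG header
 * (RFC 1950) alone, without any extra flag stored in the page.
 */
final class EntryCodec {

    static byte[] compress(byte[] cborBytes) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        deflater.setInput(cborBytes);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(cborBytes.length);
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // A zlib header's first byte carries deflate (8) in its low nibble, and the
    // two-byte header read big-endian is a multiple of 31 (RFC 1950 §2.2).
    static boolean looksLikeZlib(byte[] bytes) {
        if (bytes.length < 2) return false;
        int cmf = bytes[0] & 0xFF;
        int flg = bytes[1] & 0xFF;
        return (cmf & 0x0F) == 8 && ((cmf << 8) | flg) % 31 == 0;
    }

    static byte[] maybeDecompress(byte[] entryBytes) throws DataFormatException {
        if (!looksLikeZlib(entryBytes)) {
            return entryBytes; // legacy uncompressed CBOR entry
        }
        Inflater inflater = new Inflater();
        inflater.setInput(entryBytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream(entryBytes.length * 4);
        byte[] buf = new byte[8192];
        while (!inflater.finished()) {
            out.write(buf, 0, inflater.inflate(buf));
        }
        inflater.end();
        return out.toByteArray();
    }
}
```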

Because not all users will want to trade CPU for increased throughput, and because of the rollback barrier discussed below, this feature should first be delivered as opt-in, preferably at the per-pipeline level.

Compatibility Considerations

Once a queue contains compressed events, it cannot be read by a Logstash instance that does not support event decompression. This creates an undesired rollback barrier: a user who hits an unrelated issue would be unable to roll back to their last known-working configuration.

Queue compression should be implemented as opt-in until at least three minor versions have shipped with decompression support.

Design Requirements

  1. compression is opt-in for at least 2 minor releases
  2. compressed events are read from the queue unless explicitly configured otherwise
  3. include metrics in the pipeline.${pipeline_id}.queue.compression namespace:

    | name                  | definition                             | expected value range |
    |-----------------------|----------------------------------------|----------------------|
    | encode.spend.lifetime | encode_time / uptime                   | [0, N_CPUS]          |
    | encode.ratio.lifetime | compressed_bytes / decompressed_bytes  | [0, 1]               |
    | decode.spend.lifetime | decode_time / uptime                   | [0, N_CPUS]          |
    | decode.ratio.lifetime | decompressed_bytes / compressed_bytes  | [1, ∞)               |
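
For illustration, a rough sketch of how these lifetime metrics could be accumulated. The `CompressionMetrics` class and its accumulator fields are hypothetical and not the Logstash metrics API; the point is that "spend" normalizes codec time against process uptime (so values above 1.0 indicate more than one core's worth of work across worker threads):

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical accumulator for the proposed queue.compression metrics. */
final class CompressionMetrics {
    private final LongAdder encodeNanos = new LongAdder();
    private final LongAdder decodeNanos = new LongAdder();
    private final LongAdder compressedBytesWritten = new LongAdder();   // bytes after compression
    private final LongAdder uncompressedBytesWritten = new LongAdder(); // bytes before compression
    private final LongAdder compressedBytesRead = new LongAdder();      // bytes before decompression
    private final LongAdder uncompressedBytesRead = new LongAdder();    // bytes after decompression
    private final long startNanos = System.nanoTime();

    void onEncode(long nanos, long rawLen, long compressedLen) {
        encodeNanos.add(nanos);
        uncompressedBytesWritten.add(rawLen);
        compressedBytesWritten.add(compressedLen);
    }

    void onDecode(long nanos, long compressedLen, long rawLen) {
        decodeNanos.add(nanos);
        compressedBytesRead.add(compressedLen);
        uncompressedBytesRead.add(rawLen);
    }

    // encode.spend.lifetime: fraction of uptime spent compressing, in [0, N_CPUS]
    double encodeSpendLifetime() {
        return (double) encodeNanos.sum() / (System.nanoTime() - startNanos);
    }

    // encode.ratio.lifetime: compressed_bytes / decompressed_bytes, in [0, 1]
    double encodeRatioLifetime() {
        long raw = uncompressedBytesWritten.sum();
        return raw == 0 ? 0.0 : (double) compressedBytesWritten.sum() / raw;
    }

    // decode.spend.lifetime: fraction of uptime spent decompressing, in [0, N_CPUS]
    double decodeSpendLifetime() {
        return (double) decodeNanos.sum() / (System.nanoTime() - startNanos);
    }

    // decode.ratio.lifetime: decompressed_bytes / compressed_bytes, in [1, ∞)
    double decodeRatioLifetime() {
        long compressed = compressedBytesRead.sum();
        return compressed == 0 ? 1.0 : (double) uncompressedBytesRead.sum() / compressed;
    }
}
```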
