
Question on best practice for stream-based decompression #3113

Description

@dentiny

This issue is to ask a question about the best practice for stream-based decompression.

Context:
On the compression side, I use zstd's ZSTD_compressCCtx for block-based compression (sketched below).
On the decompression side, I read content from the cloud in chunks through a zero-copy file handle: we can get a pointer to the content of each chunk, rather than memcpy-ing the chunks into one large buffer and then decompressing.
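For concreteness, the compression side looks roughly like the sketch below (the buffer sizes and the compression level are just illustrative):

```c
#include <zstd.h>

/* Sketch of per-block compression: each input block is compressed
 * independently with ZSTD_compressCCtx(), so every block becomes its
 * own zstd frame. Level 3 is just an example value. */
static size_t compress_block(ZSTD_CCtx* cctx,
                             const void* src, size_t srcSize,
                             void* dst, size_t dstCapacity)
{
    /* dstCapacity should be at least ZSTD_compressBound(srcSize) */
    return ZSTD_compressCCtx(cctx, dst, dstCapacity, src, srcSize, 3);
}
```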

I want to know whether zstd currently supports stream-based decompression that can take advantage of this zero-copy feature.
The intention is to avoid copying the content of these chunks into one big block before decompressing.
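What I have in mind is the regular streaming API with the input buffer pointed directly at each zero-copy chunk, roughly like this (the chunk arrays stand in for the cloud reads, and error handling is trimmed):

```c
#include <stddef.h>
#include <zstd.h>

/* Sketch: decompress with ZSTD_decompressStream(), pointing the input
 * buffer at each zero-copy chunk in turn instead of memcpy-ing the
 * chunks into one contiguous buffer first. Chunk boundaries do not
 * need to line up with frame boundaries. */
static int decompress_chunks(ZSTD_DCtx* dctx,
                             const void* const* chunkPtrs,
                             const size_t* chunkSizes, size_t numChunks,
                             void* dst, size_t dstCapacity)
{
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    for (size_t i = 0; i < numChunks; ++i) {
        ZSTD_inBuffer in = { chunkPtrs[i], chunkSizes[i], 0 };
        while (in.pos < in.size) {
            size_t const ret = ZSTD_decompressStream(dctx, &out, &in);
            if (ZSTD_isError(ret)) return -1;
        }
    }
    return 0;
}
```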

I tried the buffer-less decompression API, following the example code, but it doesn't show any difference from copy-the-content-then-decompress.
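My buffer-less attempt followed roughly the pattern below (these functions live in the experimental section of zstd.h, i.e. they need ZSTD_STATIC_LINKING_ONLY; error handling is trimmed):

```c
#define ZSTD_STATIC_LINKING_ONLY   /* buffer-less API is experimental */
#include <zstd.h>

/* Sketch of the buffer-less loop: ask how many source bytes the decoder
 * wants next, feed exactly that many, and let it write straight into a
 * contiguous destination sized for the full decompressed content. */
static size_t bufferless_decompress(ZSTD_DCtx* dctx,
                                    const char* src, size_t srcSize,
                                    char* dst, size_t dstCapacity)
{
    size_t written = 0, consumed = 0;
    ZSTD_decompressBegin(dctx);
    for (;;) {
        size_t const need = ZSTD_nextSrcSizeToDecompress(dctx);
        if (need == 0) break;                         /* frame fully decoded */
        if (consumed + need > srcSize) return (size_t)-1;  /* truncated input */
        size_t const got = ZSTD_decompressContinue(dctx,
                                dst + written, dstCapacity - written,
                                src + consumed, need);
        if (ZSTD_isError(got)) return got;
        written  += got;
        consumed += need;
    }
    return written;
}
```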

I also tried the stable output buffer option checked in this PR, but still cannot see much difference, so I suspect I am using it the wrong way.
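That attempt was essentially the normal streaming loop with the destination pinned once up front, roughly as below (ZSTD_d_stableOutBuffer is also an experimental parameter):

```c
#define ZSTD_STATIC_LINKING_ONLY   /* ZSTD_d_stableOutBuffer is experimental */
#include <zstd.h>

/* Sketch: promise the decoder that `dst` stays valid and unmoved for the
 * whole frame, so it can decode directly into it and skip its internal
 * output buffer. Otherwise this is the usual ZSTD_decompressStream() loop. */
static int stable_out_decompress(ZSTD_DCtx* dctx,
                                 const void* src, size_t srcSize,
                                 void* dst, size_t dstCapacity)
{
    ZSTD_DCtx_setParameter(dctx, ZSTD_d_stableOutBuffer, 1);
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    ZSTD_inBuffer  in  = { src, srcSize, 0 };
    while (in.pos < in.size) {
        size_t const ret = ZSTD_decompressStream(dctx, &out, &in);
        if (ZSTD_isError(ret)) return -1;
    }
    return 0;
}
```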

My benchmark reads compressed content from the cloud and uses either block-based or stream-based decompression to produce the final content. The uncompressed sizes are 10 MB, 50 MB, and 500 MB respectively.
What confuses me is that "concatenating the chunks together and then decompressing" and "decompressing them chunk by chunk over multiple calls" don't show much difference; the latter is sometimes even slower.

I would be really grateful if zstd experts like you could give me some guidance on this issue. Thank you!
