ORC-XXX: Support orc.compression.zstd.workers #1756

dongjoon-hyun · 2024-01-17T00:13:37Z

What changes were proposed in this pull request?

Why are the changes needed?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

dongjoon-hyun · 2024-01-17T00:14:00Z

Evaluating if this has any benefits in ORC.

cxzl25 · 2024-01-17T04:02:11Z

Thanks @dongjoon-hyun for doing this. I also wanted to introduce this configuration after the zstd-jni merge, since Spark and Parquet have similar configurations.

dongjoon-hyun · 2024-01-17T04:44:50Z

Ya, indeed.

BTW, it seems that there is no perf gain with this so far. Interesting.

cxzl25 · 2024-01-19T06:46:56Z

> it seems that there is no perf gain with this so far

I verified this PR in a production environment: I tested orc.compression.zstd.workers with values 0, 6, 15, and 16, and there seems to be no difference.

That said, Parquet also provides an option for the number of zstd workers:

https://github.com/apache/parquet-mr/blob/c82d5b471a558124b03e67759038661a046f5938/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/codec/ZstandardCodec.java#L53-L54
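
For illustration, here is a minimal sketch of how a writer might read such a setting. The key name comes from this PR's title; everything else is an assumption: `ZstdWorkersConfig` and `readWorkers` are hypothetical names, `java.util.Properties` stands in for Hadoop's `Configuration`, and the default of 0 simply mirrors zstd's single-threaded default.

```java
import java.util.Properties;

// Hypothetical helper, not the code in this PR.
// java.util.Properties stands in for Hadoop's Configuration.
final class ZstdWorkersConfig {
    // Key name taken from the PR title.
    static final String KEY = "orc.compression.zstd.workers";
    // Assumed default: 0, mirroring zstd's single-threaded mode.
    static final int DEFAULT_WORKERS = 0;

    // Returns the configured worker count, or the default when unset.
    static int readWorkers(Properties conf) {
        String raw = conf.getProperty(KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_WORKERS;
        }
        int workers = Integer.parseInt(raw.trim());
        if (workers < 0) {
            throw new IllegalArgumentException(KEY + " must be >= 0, got " + workers);
        }
        return workers;
    }
}
```

With this sketch, the values tested above (0, 6, 15, 16) would be passed through unchanged, and a negative value would be rejected before it reaches the compressor.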

https://facebook.github.io/zstd/zstd_manual.html

    ZSTD_c_nbWorkers=400,    /* Select how many threads will be spawned to compress in parallel.
                              * When nbWorkers >= 1, triggers asynchronous mode when invoking ZSTD_compressStream*() :
                              * ZSTD_compressStream*() consumes input and flush output if possible, but immediately gives back control to caller,
                              * while compression is performed in parallel, within worker thread(s).
                              * (note : a strong exception to this rule is when first invocation of ZSTD_compressStream2() sets ZSTD_e_end :
                              *  in which case, ZSTD_compressStream2() delegates to ZSTD_compress2(), which is always a blocking call).
                              * More workers improve speed, but also increase memory usage.
                              * Default value is `0`, aka "single-threaded mode" : no worker is spawned,
                              * compression is performed inside Caller's thread, and all invocations are blocking */
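
The two modes described in the manual excerpt can be modeled roughly as follows. This is only a sketch and it is not zstd: `java.util.zip.Deflater` stands in for the compressor, a fixed thread pool stands in for zstd's worker threads, and `WorkerModeSketch`, `deflate`, and `compressAll` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.Deflater;

// Hypothetical model of "workers == 0" versus "workers >= 1".
final class WorkerModeSketch {
    // Compresses one chunk with Deflater (a stand-in for zstd).
    static byte[] deflate(byte[] chunk) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(chunk);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // Compresses every chunk and returns the results in input order.
    static List<byte[]> compressAll(List<byte[]> chunks, int workers) throws Exception {
        List<byte[]> result = new ArrayList<>();
        if (workers <= 0) {
            // Single-threaded mode: each call blocks on the caller's thread.
            for (byte[] chunk : chunks) {
                result.add(deflate(chunk));
            }
            return result;
        }
        // Worker mode: the caller only enqueues chunks and gets control
        // back immediately; compression runs on the pool threads.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<byte[]>> pending = new ArrayList<>();
            for (byte[] chunk : chunks) {
                pending.add(pool.submit(() -> deflate(chunk)));
            }
            // Collecting the results is the only blocking step.
            for (Future<byte[]> future : pending) {
                result.add(future.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }
}
```

Both modes produce the same bytes in this model; only where the work runs differs.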

dongjoon-hyun · 2024-01-19T19:40:15Z

Thank you for double-checking. Ya, it seems that our implementation has some limitations or a bug.

Apache Spark also has a ZStandardCodec implementation based on zstd-jni, and it shows a 30% or 40% improvement in the micro-benchmark.

https://github.com/apache/spark/blob/39f8e1a5953b5897f893151d24dc585a80c0c8a0/core/benchmarks/ZStandardBenchmark-results.txt#L27-L47

I'm still digging into this because I believe this should be a part of Apache ORC 2.0.0.

ORC-XXX: Support orc.compression.zstd.workers

1775f51

github-actions bot added the JAVA label Jan 17, 2024
