disheng222 edited this page Sep 2, 2016 · 6 revisions

Today’s HPC applications produce extremely large amounts of data, so the data must be compressed efficiently before being stored on parallel file systems.

We developed SZ, an error-bounded HPC data compressor based on a novel compression method that works very effectively on large-scale HPC data sets.

The key features of SZ are listed below.

  • Compression: Input: a data set (a floating-point array) with error-bound requirements; Output: the compressed byte stream.

    Decompression: Input: the compressed byte stream; Output: the reconstructed data set, with the compression error of each data point within a pre-specified error bound ∆.

  • SZ supports C and Fortran.

  • SZ supports two types of error bounds.

Users can set an absolute error bound, a relative error bound, or a combination of the two (joined with the operator AND or OR).

The absolute error bound (denoted δ) is a constant, such as 1E-6. That is, each decompressed value Di′ must lie in the range [Di − δ, Di + δ], where Di is the original data value. The relative error bound is a linear function of the global data value range size, i.e., ∆ = λr, where λ ∈ (0,1) is the error bound ratio and r is the range size. For a given data set, the range size r is equal to max(Di) − min(Di), so the error bound can be written as λ(max(Di) − min(Di)). The relative error bound thus guarantees that the compression error of any data point is no greater than λ×100 percent of the global data value range size.
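The bound definitions above can be illustrated with a minimal Python sketch. It does not call SZ or reproduce its API: the function names (`value_range`, `effective_error_bound`, `within_bound`) are hypothetical, and the reading of the combined modes (AND takes the tighter of the two bounds, OR the looser) is an assumption that should be checked against doc/user-guide.pdf.

```python
def value_range(data):
    """Global value range size r = max(Di) - min(Di)."""
    return max(data) - min(data)


def effective_error_bound(data, abs_bound=None, rel_ratio=None, mode="AND"):
    """Return the error bound implied by the user's settings.

    abs_bound : absolute error bound (delta), e.g. 1e-6
    rel_ratio : error bound ratio (lambda), a value in (0, 1)
    mode      : "AND" or "OR", used only when both bounds are given.
                Assumption: AND means both bounds must hold (the smaller
                one applies), OR means either may hold (the larger applies).
    """
    bounds = []
    if abs_bound is not None:
        bounds.append(abs_bound)
    if rel_ratio is not None:
        # Relative bound: lambda * (max(Di) - min(Di))
        bounds.append(rel_ratio * value_range(data))
    if not bounds:
        raise ValueError("at least one error bound must be given")
    if len(bounds) == 1:
        return bounds[0]
    if mode == "AND":
        return min(bounds)
    if mode == "OR":
        return max(bounds)
    raise ValueError("mode must be 'AND' or 'OR'")


def within_bound(original, decompressed, bound):
    """Check that every Di' lies in [Di - bound, Di + bound]."""
    return all(abs(d - dp) <= bound for d, dp in zip(original, decompressed))


if __name__ == "__main__":
    data = [0.0, 2.5, 10.0]              # range size r = 10.0
    restored = [0.004, 2.495, 10.009]    # made-up "decompressed" values
    bound = effective_error_bound(data, abs_bound=0.05, rel_ratio=0.001, mode="AND")
    print("effective bound:", bound)
    print("within bound:", within_bound(data, restored, bound))
```

A check like `within_bound` is a convenient way to validate any error-bounded compressor after a round trip, whichever bound mode was chosen.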

  • Detailed usage and examples can be found in doc/user-guide.pdf and the example/ directory, respectively, in the package.