compress/lzw: compress/decompress corrupts data #11142

dvyukov · 2015-06-10T15:12:55Z

The following program fails with a panic:

package main

import (
    "bytes"
    "compress/lzw"
    "fmt"
    "io/ioutil"
)

func main() {
    uncomp := []byte("a")
    buf := new(bytes.Buffer)
    w := lzw.NewWriter(buf, lzw.LSB, 2)
    _, err := w.Write(uncomp)
    if err != nil {
        panic(err)
    }
    if err := w.Close(); err != nil {
        panic(err)
    }
    r1 := lzw.NewReader(buf, lzw.LSB, 2)
    uncomp1, err := ioutil.ReadAll(r1)
    if err != nil {
        panic(err)
    }
    if !bytes.Equal(uncomp, uncomp1) {
        fmt.Printf("data0: %q\n", uncomp)
        fmt.Printf("data0: %q\n", uncomp1)
        panic("data differs")
    }
}

data0: "a"
data0: "\x01"
panic: data differs

go version devel +b0532a9 Mon Jun 8 05:13:15 2015 +0000 linux/amd64


dvyukov · 2015-06-10T16:35:09Z

Is it because of width?
Experiments show that width is the number of bits encoded from every byte.

dsnet · 2015-06-16T16:47:42Z

I don't know too much about lzw, but comments say that the litWidth value controls the "number of bits to use for literal codes". Thus, if the value is set to 2, doesn't that mean you can only encode the literals 0x00, 0x01, 0x02, and 0x03?

In fact, this seems to be what's happening since the above code works when uncomp is set to \x00, \x01, \x02, or \x03. It also seems that the incorrect output value is the input value modulo 4.

If the encoder/decoder is working properly, maybe Write should output an error if the user tries to encode data with literals that are too large? In the horrendous off-chance that other formats depend on this degenerate behavior, then the library should at least document it?
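A caller-side version of that check could look roughly like the sketch below. `checkLiterals` is a hypothetical helper written for this comment, not an existing compress/lzw function, and it assumes the valid litWidth range is 2 through 8.

```go
package main

import "fmt"

// checkLiterals is a hypothetical pre-flight check: it reports an error if
// any byte in p is too large to be represented as a litWidth-bit literal.
func checkLiterals(p []byte, litWidth int) error {
	// Assumption: litWidth must be between 2 and 8 inclusive.
	if litWidth < 2 || litWidth > 8 {
		return fmt.Errorf("litWidth %d out of range [2,8]", litWidth)
	}
	maxLit := 1<<uint(litWidth) - 1 // largest literal, e.g. 3 for litWidth 2
	for i, b := range p {
		if int(b) > maxLit {
			return fmt.Errorf("byte %#02x at offset %d does not fit in %d bits", b, i, litWidth)
		}
	}
	return nil
}

func main() {
	// 'a' is 0x61, which is larger than 0x03, so this reports an error.
	fmt.Println(checkLiterals([]byte("a"), 2))
	// All of these bytes fit in 2 bits, so this reports no error.
	fmt.Println(checkLiterals([]byte{0, 1, 2, 3}, 2))
}
```

Calling such a check before Write would turn the silent "modulo 4" corruption into an explicit error, at the cost of one extra pass over the input.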

dvyukov · 2015-06-17T11:55:40Z

@dsnet Yes, my current understanding is that there is no bug in the code.
I don't know whether it is worth a runtime check or not; maybe it is meant to be obvious to anybody using the package. However, the docs are quite cryptic ("number of bits to use for literal codes"). When I first read it, I interpreted it as some parameter of the compression algorithm.

gopherbot · 2015-06-17T13:00:14Z

CL https://golang.org/cl/11063 mentions this issue.

gopherbot · 2015-06-18T05:00:14Z

CL https://golang.org/cl/11227 mentions this issue.

Fixes #11142.
Change-Id: Id772c4364c47776d6afe86b0939b9c6281e85edc
Reviewed-on: https://go-review.googlesource.com/11227
Reviewed-by: Russ Cox

dvyukov assigned nigeltao Jun 10, 2015

ianlancetaylor added this to the Go1.5Maybe milestone Jun 10, 2015

nigeltao closed this as completed in 2a5745d Jun 18, 2015

mikioh modified the milestones: Go1.5, Go1.5Maybe Jun 18, 2015

golang locked and limited conversation to collaborators Jun 25, 2016

gopherbot added the FrozenDueToAge label Jun 25, 2016

rsc unassigned nigeltao Jun 23, 2022
