Skip to content

3.41.0 Regression in sources behavior cause all files to be read even with method: timestamp #2029

@chrishoage

Description

@chrishoage

Description

  • Running tasks with sources method timestamp causes all files to be fully read

This purple block in this graph is the data in sources being read
The yellow/green block in the graph is the actual task being run

Image

I noticed this behavior when upgrading to 3.41.0. Reverting to 3.40.1 resolves this issue completely.

Even with method: timestamp in the task definition it still reads all files.

If I had to guess it was regressed by #1872

It appears to set up both checkers regardless of method

task/variables.go

Lines 131 to 147 in f5121de

if len(origTask.Sources) > 0 {
timestampChecker := fingerprint.NewTimestampChecker(e.TempDir.Fingerprint, e.Dry)
checksumChecker := fingerprint.NewChecksumChecker(e.TempDir.Fingerprint, e.Dry)
for _, checker := range []fingerprint.SourcesCheckable{timestampChecker, checksumChecker} {
value, err := checker.Value(&new)
if err != nil {
return nil, err
}
vars.Set(strings.ToUpper(checker.Kind()), ast.Var{Live: value})
}
// Adding new variables, requires us to refresh the templaters
// cache of the the values manually
cache.ResetCache()
}

which ultimately calls

func (c *ChecksumChecker) checksum(t *ast.Task) (string, error) {
sources, err := Globs(t.Dir, t.Sources)
if err != nil {
return "", err
}
h := xxh3.New()
buf := make([]byte, 128*1024)
for _, f := range sources {
// also sum the filename, so checksum changes for renaming a file
if _, err := io.CopyBuffer(h, strings.NewReader(filepath.Base(f)), buf); err != nil {
return "", err
}
f, err := os.Open(f)
if err != nil {
return "", err
}
if _, err = io.CopyBuffer(h, f, buf); err != nil {
return "", err
}
f.Close()
}
hash := h.Sum128()
return fmt.Sprintf("%x%x", hash.Hi, hash.Lo), nil
}

I believe this behavior to be a bug as when declaring a method: timestamp that seems to me that the user expelictly does not want an expensive checksum done

To fix this I suppose the first referenced code block needs to check the method of the task and use this to set up the correct checker.

Version

3.41.0

Operating system

Alpine Linux (task is runing in a docker container)

Experiments Enabled

No response

Example Taskfile

version: "3"

  cache_sync:
    method: timestamp
    sources:
      - "{{ .CACHE_PATH }}/media/**/*"
    cmds:
      - |
        docker run \
          -v "$CACHE_PATH/media:/src" \
          -v "$BACKING_DATA_PATH/media:/dest" \
          -e "PUID=${PUID}" \
          -e "PGID=${PGID}" \
          --rm \
          --pull=never \
          --name=cache_sync \
          rsync

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: fingerprintingChanges related to checksums and caching.area: variablesChanges related to variables.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions