Description
- Running tasks with sources and method: timestamp causes all source files to be fully read
The purple block in the graph is the data in sources being read.
The yellow/green block in the graph is the actual task being run.
I noticed this behavior when upgrading to 3.41.0. Reverting to 3.40.1 resolves this issue completely.
Even with method: timestamp in the task definition, it still reads all files.
If I had to guess, this regression was introduced by #1872.
It appears to set up both checkers regardless of the task's method:
Lines 131 to 147 in f5121de
if len(origTask.Sources) > 0 {
	timestampChecker := fingerprint.NewTimestampChecker(e.TempDir.Fingerprint, e.Dry)
	checksumChecker := fingerprint.NewChecksumChecker(e.TempDir.Fingerprint, e.Dry)
	for _, checker := range []fingerprint.SourcesCheckable{timestampChecker, checksumChecker} {
		value, err := checker.Value(&new)
		if err != nil {
			return nil, err
		}
		vars.Set(strings.ToUpper(checker.Kind()), ast.Var{Live: value})
	}
	// Adding new variables, requires us to refresh the templaters
	// cache of the the values manually
	cache.ResetCache()
}
which ultimately calls
task/internal/fingerprint/sources_checksum.go
Lines 90 to 115 in f5121de
func (c *ChecksumChecker) checksum(t *ast.Task) (string, error) {
	sources, err := Globs(t.Dir, t.Sources)
	if err != nil {
		return "", err
	}
	h := xxh3.New()
	buf := make([]byte, 128*1024)
	for _, f := range sources {
		// also sum the filename, so checksum changes for renaming a file
		if _, err := io.CopyBuffer(h, strings.NewReader(filepath.Base(f)), buf); err != nil {
			return "", err
		}
		f, err := os.Open(f)
		if err != nil {
			return "", err
		}
		if _, err = io.CopyBuffer(h, f, buf); err != nil {
			return "", err
		}
		f.Close()
	}
	hash := h.Sum128()
	return fmt.Sprintf("%x%x", hash.Hi, hash.Lo), nil
}
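For contrast, here is a minimal sketch of what a timestamp-style check needs from the filesystem. It is not Task's actual implementation; newestModTime is a hypothetical helper written only to illustrate that modification times come from os.Stat metadata, so no file contents have to be read.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// newestModTime is a hypothetical helper (not part of Task). It returns the
// most recent modification time among files using only os.Stat metadata,
// so file contents are never opened or read.
func newestModTime(files []string) (time.Time, error) {
	var newest time.Time
	for _, f := range files {
		info, err := os.Stat(f)
		if err != nil {
			return time.Time{}, err
		}
		if info.ModTime().After(newest) {
			newest = info.ModTime()
		}
	}
	return newest, nil
}

func main() {
	dir, err := os.MkdirTemp("", "sources")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "a.txt")
	if err := os.WriteFile(path, []byte("payload"), 0o644); err != nil {
		panic(err)
	}
	// Pin the modification time so the comparison below is deterministic.
	want := time.Unix(1700000000, 0)
	if err := os.Chtimes(path, want, want); err != nil {
		panic(err)
	}

	got, err := newestModTime([]string{path})
	if err != nil {
		panic(err)
	}
	fmt.Println("matches pinned mtime:", got.Equal(want))
}
```

The cost of this loop scales with the number of files, while the checksum function above scales with their total size, which is presumably why the regression is so visible on a large media directory.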
I believe this behavior is a bug: when a task declares method: timestamp, it seems to me that the user explicitly does not want an expensive checksum to be computed.
To fix this, I suppose the first referenced code block needs to check the method of the task and use it to set up only the matching checker.
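The suggested fix could look roughly like the sketch below. It is a standalone illustration, not a patch against Task: sourcesChecker, timestampChecker, checksumChecker and checkerFor are hypothetical stand-ins for the fingerprint types, their Value methods return placeholder strings, and treating an empty method as checksum is an assumption about the default.

```go
package main

import (
	"fmt"
	"strings"
)

// sourcesChecker is a hypothetical stand-in for fingerprint.SourcesCheckable.
type sourcesChecker interface {
	Kind() string
	Value() (string, error)
}

// timestampChecker and checksumChecker are placeholders; the real checkers
// would stat or hash the task's sources.
type timestampChecker struct{}

func (timestampChecker) Kind() string           { return "timestamp" }
func (timestampChecker) Value() (string, error) { return "mtime-placeholder", nil }

type checksumChecker struct{}

func (checksumChecker) Kind() string           { return "checksum" }
func (checksumChecker) Value() (string, error) { return "hash-placeholder", nil }

// checkerFor returns only the checker that matches the task's method, so the
// expensive checksum is never computed for method: timestamp.
func checkerFor(method string) (sourcesChecker, error) {
	switch method {
	case "timestamp":
		return timestampChecker{}, nil
	case "checksum", "": // assumption: checksum is the default method
		return checksumChecker{}, nil
	default:
		return nil, fmt.Errorf("unsupported method %q in this sketch", method)
	}
}

func main() {
	checker, err := checkerFor("timestamp")
	if err != nil {
		panic(err)
	}
	value, err := checker.Value()
	if err != nil {
		panic(err)
	}
	// Mirrors the vars.Set(strings.ToUpper(checker.Kind()), ...) call above.
	fmt.Println(strings.ToUpper(checker.Kind()), "=", value)
}
```

One consequence worth weighing: with this approach only the variable for the selected method would be set, so a task using method: timestamp would no longer get a CHECKSUM value.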
Version
3.41.0
Operating system
Alpine Linux (task is running in a Docker container)
Experiments Enabled
No response
Example Taskfile
version: "3"

tasks:
  cache_sync:
    method: timestamp
    sources:
      - "{{ .CACHE_PATH }}/media/**/*"
    cmds:
      - |
        docker run \
          -v "$CACHE_PATH/media:/src" \
          -v "$BACKING_DATA_PATH/media:/dest" \
          -e "PUID=${PUID}" \
          -e "PGID=${PGID}" \
          --rm \
          --pull=never \
          --name=cache_sync \
          rsync