[fix] - Propagate Async File Handling Errors #3403
base: main
// It takes an io.Reader and checks if it supports seeking.
// If the reader supports seeking, it is stored in the seeker field.
func NewBufferedReaderSeeker(r io.Reader) *BufferedReadSeeker {
Rename.
@@ -82,9 +82,6 @@ func NewBufferedReaderSeeker(r io.Reader) *BufferedReadSeeker {
	)

	seeker = asSeeker(r)
	if seeker == nil {
This was removed to avoid checking out a buffer eagerly. We already lazily check out a buffer when needed.
🐇 🕳️
	}

	go func() {
		ctx, cancel := logContext.WithTimeout(ctx, maxTimeout)
Why did you remove this?
I thought I left a comment, but it was probably on the closed PR. This was removed because the context timeout is set at the call site, so setting it here has no effect since it inherits the context from `HandleFile`.
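To illustrate the point about inherited timeouts, here is a minimal sketch using the standard library `context` package rather than the project's `logContext` wrapper (assumed here to behave the same way for deadlines). The helper name `childInheritsParentDeadline` and the durations are made up for the example. A child context can shorten a deadline but never extend it, so re-applying an equal or longer timeout inside the goroutine leaves the effective deadline unchanged.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// childInheritsParentDeadline reports whether a child context created with a
// longer timeout still ends up with the parent's (earlier) deadline.
func childInheritsParentDeadline() bool {
	// The call site sets the real timeout.
	parentCtx, cancelParent := context.WithTimeout(context.Background(), time.Second)
	defer cancelParent()

	// Asking for more time inside the callee does not extend the deadline.
	childCtx, cancelChild := context.WithTimeout(parentCtx, time.Hour)
	defer cancelChild()

	parentDeadline, _ := parentCtx.Deadline()
	childDeadline, _ := childCtx.Deadline()
	return childDeadline.Equal(parentDeadline)
}

func main() {
	fmt.Println("child keeps parent deadline:", childInheritsParentDeadline())
}
```

A second timeout would only matter if it were shorter than the one set by the caller.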
pkg/handlers/archive.go
Outdated
		err = fmt.Errorf("panic occurred: %v", r)
		panicErr = fmt.Errorf("panic occurred: %v", r)
	}
	ctx.Logger().Error(panicErr, "Panic occurred when attempting to open archive")
Why is this error in particular loggable?
It shouldn't be. Logging should be handled while consuming from the `dataOrErrChan` at the call site. Will remove, thanks.
pkg/handlers/handlers.go
Outdated
var (
	ErrEmptyReader = errors.New("reader is empty")

	// ErrCriticalProcessing indicates a critical error that should halt processing.
"Critical" is one of those words that unfortunately doesn't really convey any information to someone who doesn't already know the domain. Should these errors halt processing of the source? The file? Something else?
Yea, I agree. I'll update it. 👍
pkg/handlers/handlers.go
Outdated

	// If an error occurs during MIME type detection, it is important we close the BufferedReaderSeeker
	// to release any resources it holds (checked out buffers or temp file).
	var err error
`err` gets shadowed later in this function. If that's on purpose, I find it very confusing, and would understand much better if you used a different variable name.
// handler to manage file extraction or processing.
//
// The function will return nil (success) in the following cases:
// - If the reader is empty (ErrEmptyReader)
Will it return `nil`? Or `ErrEmptyReader`?
It should return `nil` if the error from `newFileReader` is `ErrEmptyReader`. Do you think it should return `ErrEmptyReader` instead?
No, that makes sense. I was just confused because, within the context of the comment, it wasn't clear that an empty reader was signaled (elsewhere) by `ErrEmptyReader`.
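The behavior discussed in this thread (treating an empty reader as success) could look roughly like the sketch below. `openReader` and `handle` are hypothetical stand-ins for `newFileReader` and `HandleFile`; only the `ErrEmptyReader` sentinel comes from the diff, and the real functions take different arguments.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmptyReader mirrors the sentinel declared in the diff above.
var ErrEmptyReader = errors.New("reader is empty")

// openReader is a hypothetical stand-in for newFileReader. It signals an
// empty input with ErrEmptyReader and rejects negative sizes outright.
func openReader(size int) error {
	switch {
	case size == 0:
		return fmt.Errorf("opening reader: %w", ErrEmptyReader)
	case size < 0:
		return errors.New("opening reader: invalid size")
	}
	return nil
}

// handle is a hypothetical stand-in for HandleFile. An empty reader is not
// a failure, so ErrEmptyReader is mapped to nil; any other error is returned.
func handle(size int) error {
	if err := openReader(size); err != nil {
		if errors.Is(err, ErrEmptyReader) {
			return nil
		}
		return err
	}
	return nil
}

func main() {
	fmt.Println("empty input:", handle(0))
	fmt.Println("invalid input:", handle(-1))
}
```

Using `errors.Is` rather than `==` keeps the check working when the sentinel is wrapped with `%w`.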
	}

	go func() {
		ctx, cancel := logContext.WithTimeout(ctx, maxTimeout)
same question as above: why the removal?
		}
	}()

	var rpm *rpmutils.Rpm
	rpm, err = rpmutils.ReadRpm(input)
	if err != nil {
		ctx.Logger().Error(err, "error reading RPM")
Are these not real errors?
I understand why you thought I removed them. 😢 In the `defer func()`, we call `.measureLatencyAndHandleErrors`, which uses the shadowed `err`. This eventually gets logged when we consume from `dataOrErrChan`. I can avoid shadowing by explicitly sending `err` to the channel.
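The general pitfall behind this exchange can be sketched as follows. This is not the PR's code: `read`, `errRead`, and both function names are invented for the illustration. A deferred closure that reads an outer `err` sees nothing if a later `:=` declares a new, inner `err`. Declaring the other variable first and assigning with `=` (as the diff does with `var rpm *rpmutils.Rpm`) keeps a single `err`.

```go
package main

import (
	"errors"
	"fmt"
)

var errRead = errors.New("read failed")

// read is a hypothetical operation that always fails.
func read() (int, error) { return 0, errRead }

// observedWithShadowing returns what a deferred closure sees when the
// failing call declares a new err with := in an inner scope.
func observedWithShadowing() error {
	var observed error
	func() {
		var err error
		defer func() { observed = err }() // reads the outer err
		if _, err := read(); err != nil { // := declares a new, inner err
			return
		}
	}()
	return observed
}

// observedWithoutShadowing assigns to the outer err with =, so the
// deferred closure sees the failure.
func observedWithoutShadowing() error {
	var observed error
	func() {
		var err error
		defer func() { observed = err }()
		var n int
		n, err = read() // plain assignment: same err the closure captured
		_ = n
	}()
	return observed
}

func main() {
	fmt.Println("with shadowing:", observedWithShadowing())
	fmt.Println("without shadowing:", observedWithoutShadowing())
}
```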
I see what looks like three separable PRs here:
- Moving around some metric observations
- Switching to `DataOrErr`
- Surfacing a new archive extraction error
Could they be done separately?
Yep yep, will do. 👍
// DataOrErr represents a result that can either contain data or an error.
// The Data field holds the byte slice of data, and the Err field holds any error that occurred.
// This structure is used to handle asynchronous file processing where each chunk of data
// or potential error needs to be communicated back to the caller. It allows for
// efficient streaming of file contents while also providing a way to propagate errors
// that may occur during the file handling process.
type DataOrErr struct {
	Data []byte
	Err  error
}
🦀 🦀 🦀
Yeah, if we do go down this route in a more organized way it'd be good to look at prior art. Rust's version of this is probably the most usable by people other than functional programming weirdos. Kotlin has a third-party version that references the concept's monadic roots and some prior art like the F# version and the inscrutable Haskell dorks.
Description:
This PR updates error reporting during file processing. Previously, each Handler spawned a goroutine to handle file processing and returned a channel for the caller to collect results. However, this method only returned processed data and failed to propagate errors due to the file processing happening in separate goroutines. As a result, errors were simply logged, and critical errors were returned within the goroutine, leaving the calling function unaware of any issues. This became especially problematic when
HandleFile
was called viahandleBinary
, as the reader passed toHandleFile
is a pipe. If the caller isn’t informed of an error, it can't properly consume the reader, potentially leaving it open and exhausting resources.To resolve this, the PR introduces the
DataOrErr
struct, allowing handlers to return both critical and non-critical errors to the caller. This ensures better resource management and prevents resource leaks.Checklist:
make test-community
)?make lint
this requires golangci-lint)?
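The pattern described above can be sketched as follows. Only the `DataOrErr` struct is taken from the diff; `produce`, `consume`, and the `errFatal` sentinel are hypothetical names invented for this illustration, and the real handlers are more involved. The producer goroutine sends either a chunk or an error on the same channel, and the caller decides what to do with each.

```go
package main

import (
	"errors"
	"fmt"
)

// DataOrErr matches the struct introduced in the diff.
type DataOrErr struct {
	Data []byte
	Err  error
}

// errFatal is a hypothetical sentinel marking errors that stop processing
// of the current file.
var errFatal = errors.New("fatal processing error")

// produce simulates a handler goroutine. Empty chunks yield a non-fatal
// error and are skipped; the chunk at index failAt yields a fatal error
// and ends the stream. Pass a negative failAt for no fatal error.
func produce(chunks []string, failAt int) <-chan DataOrErr {
	out := make(chan DataOrErr)
	go func() {
		defer close(out)
		for i, c := range chunks {
			if i == failAt {
				out <- DataOrErr{Err: fmt.Errorf("chunk %d: %w", i, errFatal)}
				return
			}
			if c == "" {
				out <- DataOrErr{Err: fmt.Errorf("chunk %d: empty, skipped", i)}
				continue
			}
			out <- DataOrErr{Data: []byte(c)}
		}
	}()
	return out
}

// consume drains the channel at the call site. It collects data, counts
// non-fatal errors, and returns the fatal error if one was sent.
func consume(ch <-chan DataOrErr) (data []string, warnings int, err error) {
	for item := range ch {
		switch {
		case item.Err == nil:
			data = append(data, string(item.Data))
		case errors.Is(item.Err, errFatal):
			err = item.Err // keep ranging until the producer closes the channel
		default:
			warnings++
		}
	}
	return data, warnings, err
}

func main() {
	data, warnings, err := consume(produce([]string{"a", "", "b", "c"}, 3))
	fmt.Println(data, warnings, err)
}
```

Because the caller now sees the fatal error, it can take whatever cleanup step it needs (for example, draining or closing a pipe) instead of relying on a log line emitted inside the goroutine.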