Getting 408 on High Load save #833
Comments
A few questions to help us work out what's happening:
@daneshka Please provide the
@laljikanjareeya @AVaksman
Here you can see the rate of this error during the last 12 hours:
@daneshka, have you tried retrying on error? If you retry after the 408 is it successful?
@crwilcox
@daneshka it would be interesting to understand if this is a transient error. I imagine it is and we can likely harden the client around this.
We didn't see the issue after adding our own retry on top of the library. |
@crwilcox how do you think we should proceed here? Is there anything we can do?
Our uploads all use our streaming API underneath the surface. However, 408 error codes are not currently being retried; we have a small whitelist of retried error codes: 429, 500, 502, and 503. I don't think there's a reason we couldn't consider making that change.
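Until the library itself retries 408s, a caller-side retry along the lines of that whitelist (with 408 added) is a reasonable workaround. The sketch below is illustrative only; saveWithRetry and its parameters are not part of @google-cloud/storage:

// Minimal caller-side retry sketch: treat 408 like the whitelisted codes.
// saveWithRetry is a hypothetical helper, not a library API.
const RETRYABLE_CODES = new Set([408, 429, 500, 502, 503]);

async function saveWithRetry(file, contents, options, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await file.save(contents, options);
    } catch (err) {
      if (!RETRYABLE_CODES.has(err.code) || attempt >= maxAttempts) throw err;
      // Simple exponential backoff before the next attempt.
      await new Promise((resolve) => setTimeout(resolve, 2 ** attempt * 100));
    }
  }
}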
Just got this error today when writing images via
Ran into the same issue today. We use

const fileStream = myFile.createWriteStream({
  // Support for HTTP requests made with `Accept-Encoding: gzip`
  metadata: {
    contentType: 'application/json'
    // contentEncoding: 'gzip'
  },
  gzip: true,
  resumable: false,
  validation: 'md5'
  // validation: false
});

to create the stream, and we also catch all errors of the stream, but we get an uncatchable error when the response is 408:
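For context, "catching all errors of the stream" typically looks like the sketch below, with handlers on both ends of the pipe (sourceStream is an assumed readable source); the 408 failure described above surfaced outside of these handlers:

// Handlers on both streams catch any error either one emits.
sourceStream
  .on('error', (err) => console.error('read error:', err))
  .pipe(fileStream)
  .on('error', (err) => console.error('upload error:', err))
  .on('finish', () => console.log('upload complete'));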
If I read this correctly, adding 408 to the retryable errors would solve the issue? I've added the shouldRetryRequests function on our end to test this, and it looks good so far. Wouldn't that be sufficient? @stephenplusplus
Regarding my uncatchable error, could it be that createWriteStream does not forward errors correctly? createWriteStream is using duplexify behind the scenes, but there is no on('error') handler registered on the "fileWriteStream". Not sure, just an idea I thought I'd write down.
@simllll is there any chance you could provide a runnable example and any more details about what is happening when the error occurs?
That's the struggle with this issue: it only happens in production, and I believe it is related either to high load (like the OP) or to too many parallel transactions. I couldn't find a way to reproduce the original timeout issue, but this one is gone since adding 408 to the retryable errors. To simulate the failure I added

throw new Error('TEST');

and I was not able to catch this error so far, probably because it's not a stream error itself; it's a thrown error (like the one where JSON cannot be parsed, I guess).
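That matches general Node behavior: an exception thrown synchronously inside some unrelated async callback is never routed to the stream's 'error' event, so neither the stream handlers nor a surrounding try/catch see it. A minimal illustration, reusing the fileStream from the snippet above (not specific to the storage client):

// An error emitted on the stream is catchable:
fileStream.on('error', (err) => console.error('caught stream error:', err.message));

// ...but an error *thrown* from an unrelated async callback is not.
// It bypasses the 'error' handler and any try/catch around the write,
// and surfaces as an uncaughtException on the process instead.
try {
  setImmediate(() => {
    throw new Error('TEST'); // never reaches the catch below
  });
} catch (err) {
  console.error('this never runs');
}

process.on('uncaughtException', (err) => console.error('only visible here:', err.message));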
Indeed, if I wrap handleResp in a try/catch block and catch the error:

try {
  ...
} catch (err) {
  callback(err);
}

I'm able to catch the error :) See here for details:
What I still don't understand is why this solves the JSON parse exception (it does indeed; I have already reproduced it successfully on production). The JSON parse error is not thrown directly, it's just passed back as the "err" parameter within the parsedHttpRespBody object. I can't see what I'm missing here.
update 1
update 2
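One plausible explanation, sketched as hypothetical code rather than the actual nodejs-common implementation: when the server returns an HTML error page (as with the 408 responses in this issue), JSON.parse throws instead of returning a parsed error body, and without the wrapper that throw escapes the callback chain entirely; with it, the failure becomes an ordinary error passed to the callback:

// Hypothetical response handler, for illustration only.
function handleResp(err, resp, body, callback) {
  try {
    // Throws if the body is an HTML error page rather than JSON.
    const parsedBody = typeof body === 'string' ? JSON.parse(body) : body;
    callback(err, parsedBody, resp);
  } catch (parseErr) {
    // Without the try/catch this throw would escape as an uncaught exception.
    callback(parseErr, null, resp);
  }
}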
Getting this "Error 408" quite a lot lately. It's making storage very unreliable and it messes up our process a lot, even though our load is minimal for almighty cloud infra (about 1K requests per month) and the file size is ~400 KB.
Same issue here. We're seeing an increase in 408 errors lately. It would be great if the retry PR googleapis/nodejs-common#562 could be merged.
@pebo that PR wasn't moving too quickly, so I've thrown together a new one: googleapis/nodejs-common#578. Hopefully, we can merge and release soon.
@stephenplusplus thanks for the PR, but after checking our latest error logs, this PR (which adds retries for 408) might make GCS behave even worse for us. We're occasionally seeing very long upload times (>6 minutes!) for tiny 90-byte files before GCS throws the exception mentioned in this issue. The code is running in Cloud Run on the node:12-slim base image, and the container had successfully written to GCS a few seconds before the request that caused the 408 and successfully handles writes triggered by other invocations just after the failing operation. Using
Can we reduce the timeouts for writes to GCS (then retries would make sense)?
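One way to get a shorter effective timeout from the caller side is to enforce a deadline and destroy the upload stream when it fires, so a retry can start instead of waiting minutes. A rough sketch; uploadWithDeadline, the 30-second deadline, and sourceStream are all assumptions, not library features:

// Abort an upload that takes longer than deadlineMs so the caller can retry.
function uploadWithDeadline(file, sourceStream, deadlineMs = 30000) {
  return new Promise((resolve, reject) => {
    const dest = file.createWriteStream({ resumable: false });
    const timer = setTimeout(() => {
      // Destroying the stream surfaces an error instead of hanging for minutes.
      dest.destroy(new Error(`upload exceeded ${deadlineMs} ms`));
    }, deadlineMs);

    sourceStream
      .on('error', reject)
      .pipe(dest)
      .on('error', (err) => { clearTimeout(timer); reject(err); })
      .on('finish', () => { clearTimeout(timer); resolve(); });
  });
}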
We now retry 408 errors and allow
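For reference, newer versions of @google-cloud/storage also expose client-level retry configuration; roughly along these lines (option names follow the documented retryOptions, but verify against the version you are running):

// Sketch of client-level retry configuration; check the docs for your version.
const { Storage } = require('@google-cloud/storage');

const storage = new Storage({
  retryOptions: {
    autoRetry: true,
    maxRetries: 5,
    // Treat 408 (plus the usual transient codes) as retryable.
    retryableErrorFn: (err) => [408, 429, 500, 502, 503].includes(err.code),
  },
});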
I'm on "@google-cloud/storage": "^7.1.0",
In the logs I see some HTML errors. Why are they HTML?! Isn't it a bug in itself (returning errors as HTML on API requests)?
Environment details
@google-cloud/storage version:

Steps to reproduce
The error only happens on high-load saves to the same Bucket.
We use a GUID in the URI.
It sends HTML in the error message!
The main message in the HTML is:
Thanks!
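For a rough idea of the load pattern described above, a repro attempt could fire many parallel saves of small GUID-named objects at a single bucket. A sketch only; the bucket name, payload, and concurrency are placeholders:

// Rough repro sketch: many concurrent saves of GUID-named objects to one bucket.
const { Storage } = require('@google-cloud/storage');
const { randomUUID } = require('crypto');

const bucket = new Storage().bucket('my-test-bucket');
const payload = JSON.stringify({ some: 'data' });

async function burst(concurrency = 200) {
  const results = await Promise.allSettled(
    Array.from({ length: concurrency }, () =>
      bucket.file(`${randomUUID()}.json`).save(payload, { resumable: false })
    )
  );
  const failed = results.filter((r) => r.status === 'rejected');
  console.log(`${failed.length}/${concurrency} saves failed`);
  failed.forEach((r) => console.error(r.reason.code, r.reason.message));
}

burst().catch(console.error);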