🐛 [firestore-bigquery-export] Server returned 502 when mirroring data to BQ #2133
Comments
Hi, thanks for raising this. I'm looking into it. It may be an intermittent error on the BQ side, which would be difficult to resolve from extensions. I'll keep you updated on my progress.
So the log is occurring here. When a cloud task fails to enqueue, the function makes this log. Then, if more than 10 seconds have passed since the original Firestore write event, it returns; otherwise it throws the error. I'll have to look into the reasons for implementing it like this. Maybe we should be retrying the enqueue at this point instead, to prevent data loss.
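As a rough illustration of the behaviour described above and of the retry idea, here is a minimal sketch. It is not the extension's actual source: `onEnqueueFailure`, `enqueueWithRetry`, `MAX_EVENT_AGE_MS`, and the backoff parameters are all hypothetical names and values chosen for the example, and the `enqueue` callback stands in for whatever call puts the task on the queue.

```typescript
// Hypothetical sketch; names and defaults are assumptions, not the extension's code.
type EnqueueFailureAction = "drop" | "rethrow";

// The 10-second window mentioned in the comment above.
const MAX_EVENT_AGE_MS = 10_000;

// Mirrors the described behaviour after a failed enqueue: if the original
// write event is older than the window, the function just returns ("drop");
// otherwise the error is thrown again ("rethrow").
function onEnqueueFailure(eventTimeMs: number, nowMs: number): EnqueueFailureAction {
  return nowMs - eventTimeMs > MAX_EVENT_AGE_MS ? "drop" : "rethrow";
}

// Possible alternative: retry the enqueue a few times with exponential
// backoff before giving up. `sleep` is injectable so the helper can be
// exercised without real delays.
async function enqueueWithRetry<T>(
  enqueue: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await enqueue();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

Retrying like this would only narrow the window for transient failures such as a 502; an enqueue that keeps failing past the last attempt still surfaces the error to the caller.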
appreciate your time looking into this. the link you provided mentions a 1 MB payload limit for cloud tasks. I can confirm our data object is nowhere near 1 MB in size.
Yeah, the code comment about payload size isn't relevant here, but it is another existing issue with this extension for some users with large payloads. The relevant part of the link is the catch block at line 143, if you're interested. I'll see if we can get this issue prioritised.
Speaking of data loss, I found another huge issue with cloud tasks and data loss, and a way to replicate it: if there are any pending tasks on the cloud task queue when an upgrade or configuration change is made to the extension, the cloud tasks are cleared. As a way to test this, set up the extension, and navigate to the extension's task list
thanks for bringing that up @Nushio. I think the same might happen even if they're not paused, if the queues are backed up enough. I'll open a new issue to track this, and discuss it with the extensions team. I don't have an immediate viable solution for that in mind either, as the task queue name is pretty much fixed across reconfigurations (I think).
Yes, our queue wasn't paused when I ran an update and lost data as a result. (It's not lost, it's still on Firestore, but now I need to ping each document to trigger a sync.) Pausing was a simple way to replicate the issue.
@cabljac I was wondering if there is an update on this? this weekend was a very busy one for us (business-wise) and it turned out several Firestore document updates did not land in BQ. we've added alerting for when errors happen in the extension; however, the errors do not include the document ID in question, so we only know the sync mechanism failed, but not for which document (we have to look that up manually). our alerting is triggered on; [screenshot] as you can see the errors occurred more or less at the same time, with one outlier. I'm aware one error type is not in scope of this specific issue; however, I wanted to share it with you anyway as the general issue we're facing is; Error 1; (perhaps slightly off (issue) topic)
Error 2; (same as original issue topic)
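One way the missing document ID could be addressed is to attach the document path to the error log entry so an alert can point at the affected document. The sketch below is an assumption about how that might look, not the extension's logging code: `SyncErrorLog` and `buildSyncErrorLog` are hypothetical names, and the `code` lookup is based only on the `errorInfo.code` value visible in the stack trace from this issue.

```typescript
// Hypothetical sketch; the field names are illustrative, not an existing log schema.
interface SyncErrorLog {
  severity: "ERROR";
  message: string;
  documentPath: string;
  documentId: string;
  errorCode?: string;
}

// Builds a structured log entry that carries the Firestore document path
// and ID alongside the error, so alerts can identify the failed document.
function buildSyncErrorLog(documentPath: string, err: unknown): SyncErrorLog {
  // The document ID is the last segment of a path like "orders/abc123".
  const documentId = documentPath.split("/").pop() ?? documentPath;
  const detail = err instanceof Error ? err.message : String(err);

  // Some errors carry a string `code` (e.g. 'functions/unknown-error');
  // read it defensively since `err` is untyped.
  const errorCode =
    typeof err === "object" && err !== null && "code" in err
      ? String((err as { code: unknown }).code)
      : undefined;

  return {
    severity: "ERROR",
    message: `Error when mirroring data to BigQuery: ${detail}`,
    documentPath,
    documentId,
    errorCode,
  };
}
```

Emitting the entry as a single JSON line, for example `console.error(JSON.stringify(buildSyncErrorLog(path, err)))`, would keep the document ID in a separate field that an alert or log filter can match on.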
a kind reminder 🙏🏾 do you have an update on a potential fix?
Just looking to get prioritisation on this, which we may now have
The issue with my fix idea is that it's not really going to fix it - it will reduce the frequency of this error but we're still going to get data loss...
Out of interest @boywijnmaalen do you have EventArc events enabled for the extension? Trying to rule things out
Unsure what you mean by having EventArc enabled for the extension. I just had a look at our firestore>bigquery extension properties, and from what I can tell there is an option to enable events if you want to use custom event handlers. however, we do not make use of custom event handlers. as far as I can tell only the BigQuery API and Cloud Tasks API are required for this extension. I'm not aware of any specific event-related APIs that need to be enabled as well.
apologies for chasing you once again, is there an update/ETA for this issue?
Hey @boywijnmaalen - We've released 0.1.56 which should mitigate this issue, by minimising the use of cloud tasks. I think there may be a caching issue with the marketplace currently, and 0.1.56 isn't showing up. You can update it via this link though - https://console.firebase.google.com/u/0/project/_/extensions/install?ref=firebase%[email protected]
Closing this for now, feel free to reopen if the issue persists on 0.1.56
thanks! will observe and report back in a couple of days |
Describe your configuration
BigQuery Dataset location: europe-****
BigQuery Project ID: ***
Collection path: ***
Enable Wildcard Column field with Parent Firestore Document IDs: true
Dataset ID: ***
Table ID: ***
BigQuery SQL table Time Partitioning option type: DAY
BigQuery Time Partitioning column name: timestamp
Firestore Document field name for BigQuery SQL Time Partitioning field option: timestamp
BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
BigQuery SQL table clustering: ***
Maximum number of synced documents per second: 100
Backup Collection Name: Parameter not set
Transform function URL: Parameter not set
Use new query syntax for snapshots: no
Exclude old data payloads: no
Use Collection Group query: no
Cloud KMS key name: Parameter not set
Describe the problem
in the last 30 days I can find 4 occurrences of this error:
Error when mirroring data to BigQuery FirebaseFunctionsError: Unexpected response with status: 502
on/around these timestamps we have updates missing for Firestore documents in BQ.
fix: a manual edit of the FS doc in question resulted in a sync to BQ (as expected).
the error contains a HTML response;
stack trace mentioned in the prntscrn;
at FunctionsApiClient.toFirebaseError (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:297:20)
at FunctionsApiClient.enqueue (/workspace/node_modules/firebase-admin/lib/functions/functions-api-client-internal.js:146:32)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /workspace/lib/index.js:108:9 {
errorInfo: { code: 'functions/unknown-error', message: 'Unexpected response with status: 502 }
Steps to reproduce:
unsure how to reproduce as I do not fully understand the problem, however, as indicated by the error, a 502, it seems the error occurs in the function running the firestore-bigquery-export business logic.
Expected result
to have every update for relevant Firestore docs synced to BQ
Actual result
missing updates for Firestore docs in BQ