Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 [firestore-bigquery-export] Error initializing raw change log table #2134

Closed
boywijnmaalen opened this issue Jul 23, 2024 · 17 comments · Fixed by #2255
Closed

🐛 [firestore-bigquery-export] Error initializing raw change log table #2134

boywijnmaalen opened this issue Jul 23, 2024 · 17 comments · Fixed by #2255
Assignees
Labels
extension: firestore-bigquery-export Related to firestore-bigquery-export extension type: bug Something isn't working type: question Further information is requested

Comments

@boywijnmaalen
Copy link

Describe your configuration

  • Extension name: firestore-bigquery-export
  • Extension version: 0.1.51
  • Configuration values:
    BigQuery Dataset location: europe-****
    BigQuery Project ID: ***
    Collection path: ***
    Enable Wildcard Column field with Parent Firestore Document IDs: true
    Dataset ID: ***
    Table ID: ***
    BigQuery SQL table Time Partitioning option type: DAY
    BigQuery Time Partitioning column name: timestamp
    Firestore Document field name for BigQuery SQL Time Partitioning field option: timestamp
    BigQuery SQL Time Partitioning table schema field(column) type: TIMESTAMP
    BigQuery SQL table clustering: ***
    Maximum number of synced documents per second: 100
    Backup Collection Name: Parameter not set
    Transform function URL: Parameter not set
    Use new query syntax for snapshots: no
    Exclude old data payloads: no
    Use Collection Group query: no
    Cloud KMS key name: Parameter not set

Describe the problem

in the last 30 days I can find 14 occurrence of this error:

Unhandled error Error: Error initializing raw change log table: Field data has type JSON, which is not supported for clustering.

image

I could find at least 1 (but won't rule out more) missing updates for Firestore documents in BQ.

fix: a manual edit of the FS doc in question resulted in a sync to BQ (as expected).

stack trace;

Unhandled error Error: Error initializing raw change log table: Field data has type JSON, which is not supported for clustering. at FirestoreBigQueryEventHistoryTracker.initialize (/workspace/node_modules/@firebaseextensions/firestore-bigquery-change-tracker/lib/bigquery/index.js:192:23) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async /workspace/lib/index.js:134:5 at async /workspace/node_modules/firebase-functions/lib/common/providers/tasks.js:74:17

Steps to reproduce:

unsure how to reproduce as I do not fully understand the problem

Expected result

to have every update for relevant Firestore docs synced to BQ

Actual result

missing update(s) for Firestore docs in BQ

@boywijnmaalen boywijnmaalen added the type: bug Something isn't working label Jul 23, 2024
@cabljac cabljac added the extension: firestore-bigquery-export Related to firestore-bigquery-export extension label Jul 23, 2024
@cabljac
Copy link
Contributor

cabljac commented Jul 23, 2024

This is strange to me, as we don't use the datatype JSON as far as I know, data is saved in a raw string.

Do you have any more info on your Clustering you're able to provide?

@boywijnmaalen
Copy link
Author

boywijnmaalen commented Jul 23, 2024

Hi @cabljac,

thanks for your quick reply;

schema:
image

(part of) table info;
image

@cabljac
Copy link
Contributor

cabljac commented Jul 24, 2024

Was this schema solely generated by the extension?

@boywijnmaalen
Copy link
Author

boywijnmaalen commented Jul 25, 2024

I don't really remember (the use of this extension has been in place for over 3 years)

tbh, does it matter? clustering by JSON type fields is not at all possible in BQ. our two JSON fields (data, old_data) are therefore not part of the defined clustered fields.

the error does not make sense to me. it seems somehow internally the data/old_data fields are used in such a way that is incompatible with JSON type.

what also strikes me as odd, is the fact the errors seems to be randomly appearing, if it really was related to JSON types fields then I'd expect this error to occur for every firestore>bigquery event. but this is not the case.

@boywijnmaalen
Copy link
Author

boywijnmaalen commented Jul 25, 2024

could this error be related to the fact we updated the extension to version 0.1.51?

I'm unsure when we updated this extension however it must have been recently as this version was released on June 19th

@cabljac
Copy link
Contributor

cabljac commented Jul 25, 2024

I'll have a look back at the release commits and see if anything could have caused this to start happening. Thanks for providing all this info by the way!

I'll see if we can get this issue prioritised

@boywijnmaalen
Copy link
Author

@cabljac

wondering if you have an update on this?

@boywijnmaalen
Copy link
Author

boywijnmaalen commented Sep 9, 2024

@cabljac

a kind reminder 🙏🏾

do you have an update on a potential fix?

@boywijnmaalen
Copy link
Author

@cabljac

apologies for chasing you once again, is there an update/ETA for this issue?

@Gustolandia Gustolandia self-assigned this Nov 29, 2024
@Gustolandia
Copy link
Contributor

Hi @boywijnmaalen,

Thank you for providing the detailed context and information about the issue. Based on the schema and table info shared in the screenshots, it appears that clustering is configured correctly on supported fields (document_name, operation, and ingestion) and does not include JSON fields like data or old_data, which aligns with BigQuery's limitations.

To help us further debug this issue, I have a few follow-up questions:

  1. Are you currently using the latest version of the extension (v0.1.56)? If not, could you update to the latest version and see if the issue persists?
  2. Does the schema in your environment match the schema provided in the screenshots? Specifically, are the clustering fields (document_name, operation, ingestion) the same?
  3. Have there been any manual changes made to the table schema or configuration (e.g., adding unsupported fields like JSON for clustering)?
  4. Is the issue occurring consistently, or does it happen randomly? If random, do you notice any patterns, such as specific document types or operations causing the error?
  5. Could you confirm whether the dataset and table configurations (partitioning by ingestion, clustering on valid fields, etc.) match those shown in the screenshots?
  6. Logs for Errors: Could you share any specific logs or error messages that occurred when the issue arose? This can help us pinpoint where the failure might be happening.
  7. Extension Settings: Could you share the full list of configuration settings for your extension, similar to the ones you've provided in the original post? This includes things like Wildcard Column, Collection Group Queries, or any custom transforms.
  8. Historical Version Info: Do you know if this issue started occurring after upgrading to a specific version of the extension? If possible, could you confirm which version was in use before this issue started?
  9. Document Details: Are there any specific characteristics of the documents that failed to sync (e.g., larger size, nested fields, special characters)? If so, providing an example document structure would be helpful.
  10. Error Frequency: Approximately how often does this error occur? For example, is it happening with every update, or just in a small subset of events?
  11. Confirmation of Manual Sync Fix: You mentioned a manual edit of the Firestore document resolved the issue temporarily. Can you elaborate on what kind of edit you made? Was it just an update to a field, or something more substantial?
  12. BigQuery Table State: Have you inspected the raw changelog table in BigQuery to confirm whether the missing data or updates exist there? This might indicate whether the issue is related to syncing or querying the data.
  13. Firestore Write Patterns: Do you have any unusual Firestore write patterns (e.g., bulk writes, batched updates, or high-frequency writes)? If so, this might stress the extension's sync mechanism.
  14. Reproducibility: If this issue is not completely random, is there any way to reproduce it in a test environment? For example, creating a specific document or running a specific operation that tends to fail.
    These details will help us narrow down the root cause and determine whether this might be related to an outdated version, a specific configuration, or an intermittent issue in the extension.

Looking forward to your response.

Best regards,
Gustavo
(I'm cleaning all the backlog of the extensions)

@Gustolandia Gustolandia added the type: question Further information is requested label Dec 1, 2024
@boywijnmaalen
Copy link
Author

Are you currently using the latest version of the extension (v0.1.56)? If not, could you update to the latest version and see if the issue persists?

v0.1.56 was released ~2 weeks ago (as it contains a possible fix for another issue I raised). This version has been running on test/acceptance since it was released.
on Dec 2nd ~07:30 CEST, I updated our production environment to v0.1.56 as well. One thing I can say is the issue is occurring a lot more than before.

last BQ error occurred Nov 16th, however after updating the extension the error mentioned in this thread occurred 13 times.

search query;
resource.labels.function_name=~"ext-firestore-bigquery-export" severity=ERROR

search result;
Screenshot 2024-12-03 at 13 48 31

Does the schema in your environment match the schema provided in the screenshots? Specifically, are the clustering fields (document_name, operation, ingestion) the same?

Yes.
image

Have there been any manual changes made to the table schema or configuration (e.g., adding unsupported fields like JSON for clustering)?

No.

Is the issue occurring consistently, or does it happen randomly? If random, do you notice any patterns, such as specific document types or operations causing the error?

No, the errors seem to be totally random. I cannot deduct any patterns, nor do we have unexpected number of Firestore writes. our application is a no-peak-traffic app and is mostly busy during business hours (7 days a week).

Could you confirm whether the dataset and table configurations (partitioning by ingestion, clustering on valid fields, etc.) match those shown in the screenshots?

Yes, the structure remains unchanged

Logs for Errors: Could you share any specific logs or error messages that occurred when the issue arose? This can help us pinpoint where the failure might be happening.

While searching for the logs you requested I think I noticed some thing odd. I filtered out one of the errors and then searched for all the logs matching the error's execution ID.
From what I understand from the result, is that the extension is trying to 'update' the clustered field setting, it is trying the use 'data' as a clustered field too. (but as you can see 'data' is not part of the BQ table clustered field definition). I looked at the other 12 errors, it seems all errors have the same logging result when search for the respective execution ID;

Screenshot 2024-12-03 at 13 58 25

Extension Settings: Could you share the full list of configuration settings for your extension, similar to the ones you've provided in the original post? This includes things like Wildcard Column, Collection Group Queries, or any custom transforms.

I think I found the issue (encircled in red). it seems I added 'data' as a clustered field, this is a mistake, I know full well 'data' is a JSON field.

Screenshot 2024-12-03 at 14 20 02

Historical Version Info: Do you know if this issue started occurring after upgrading to a specific version of the extension? If possible, could you confirm which version was in use before this issue started?

we're having this issue for a while. but it seems it only happens when we update the extension.

Document Details: Are there any specific characteristics of the documents that failed to sync (e.g., larger size, nested fields, special characters)? If so, providing an example document structure would be helpful.

document is nested, but not too deep, max 2 levels, as a whole each document is ~2100 chars long when converted to JSON.

Error Frequency: Approximately how often does this error occur? For example, is it happening with every update, or just in a small subset of events?

when I come to think of it, it seems to only occur when updating the extension.

Confirmation of Manual Sync Fix: You mentioned a manual edit of the Firestore document resolved the issue temporarily. Can you elaborate on what kind of edit you made? Was it just an update to a field, or something more substantial?

while answering all your questions, I think this remark was related to the issue mentioned in #2133, not this issue, apologies.

BigQuery Table State: Have you inspected the raw changelog table in BigQuery to confirm whether the missing data or updates exist there? This might indicate whether the issue is related to syncing or querying the data.

There was actually missing data when manually looking for it. but this one is also related to the issue #2133.

Firestore Write Patterns: Do you have any unusual Firestore write patterns (e.g., bulk writes, batched updates, or high-frequency writes)? If so, this might stress the extension's sync mechanism.

Nothing odd from an application perspective, however, from time to time we do migrations in which we touch all documents but during these updates no issued arise.

Reproducibility: If this issue is not completely random, is there any way to reproduce it in a test environment? For example, creating a specific document or running a specific operation that tends to fail.

already answered, seems totally random.

Tomorrow morning (CEST) I'll reconfigure the extension, to not include the 'data' column for the clustered fields setting. I think the current value for this setting is most probably the issue we're seeing.

will revert back once I re-configured the extension.

for now a final question; if the errors are indeed related to updating the extension, why is it occurring 13 times and seemingly at no logical intervals (except for the last ~8 errors, which are 1h apart)?

apologies for wasting your time, you certainly didn't waste mine, answering your questions helped me find the issue (well, it surely seems this is the issue)

TBC

@Gustolandia
Copy link
Contributor

for now a final question; if the errors are indeed related to updating the extension, why is it occurring 13 times and seemingly at no logical intervals (except for the last ~8 errors, which are 1h apart)?

No idea. I failed to reproduce the issue locally with the information you had provided, that is why I was trying to get more info.

I think I found the issue (encircled in red). it seems I added 'data' as a clustered field, this is a mistake, I know full well 'data' is a JSON field.

Let me know. I hope it will solve it. ;)

@Gustolandia
Copy link
Contributor

Hi @boywijnmaalen,

Is the issue solved?

@boywijnmaalen
Copy link
Author

I cannot tell for sure as these errors only occur when the extension is updated. There has not been a new release (that I am aware of) since our last communication.

@cabljac
Copy link
Contributor

cabljac commented Jan 6, 2025

for now a final question; if the errors are indeed related to updating the extension, why is it occurring 13 times and seemingly at no logical intervals (except for the last ~8 errors, which are 1h apart)?

This could be due to exponential backoff settings in cloud tasks. I believe this table is initialised via an onConfigure or onInstall cloud task queue, and perhaps there's a maximum (1hr) to the backoff. I would have to investigate further to confirm

@cabljac
Copy link
Contributor

cabljac commented Jan 17, 2025

an update to this:

Looking into the code, i'm not sure how data became a JSON type in your schema through the extension alone. is this something you changed via replacing the table or similar?

Either way, I have PRs to safeguard against this:

#2256
#2256

With this change, a column with an invalid type specified in the clustering param will cause the extension not to update/add clustering on the table, and log a warning instead - similar to how the extension behaves currently if a specified clustering param does not exist in the schema

@cabljac cabljac assigned dackers86 and unassigned cabljac Jan 17, 2025
@boywijnmaalen
Copy link
Author

is this something you changed via replacing the table or similar?

yes, this is exactly what we did. turning it into a JSON object allowed us to reduce bytes read when selecting only specific properties from our objects. basically what happened is, we installed the extension, then two years passed, we turned the data column into a JSON field by re-creating the table and copying over all data, some more time passed and we updated the extension and noticed errors in the logs. the error was actually quite clear on describing the issue, when looking back now, however I never linked the initial extension config (which did include data as a clustering column) with the error, until now.

we have updated the extension config so it no longer includes the data column for clustering.

With this change, a column with an invalid type specified in the clustering param will cause the extension not to update/add clustering on the table, and log a warning instead - similar to how the extension behaves currently if a specified clustering param does not exist in the schema

that is perfect, that way it will become even more clear.

thanks for your effort and taking the time to once more reply to this issue 👍🏾

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension: firestore-bigquery-export Related to firestore-bigquery-export extension type: bug Something isn't working type: question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants