Skip to content

chore(source-google-search-console): Migrate to CDK v6#59140

Merged
Christo Grabowski (ChristoGrab) merged 10 commits intomasterfrom
christo/gsc-cdk6
May 6, 2025
Merged

chore(source-google-search-console): Migrate to CDK v6#59140
Christo Grabowski (ChristoGrab) merged 10 commits intomasterfrom
christo/gsc-cdk6

Conversation

@ChristoGrab
Copy link
Contributor

@ChristoGrab Christo Grabowski (ChristoGrab) commented Apr 29, 2025

What

Migrates google-search-console to the latest major CDK version, in order to facilitate the migration of streams from the legacy CDK to low-code.

User Impact

Hopefully none, if any should be positive in the form of more mature CDK logic.

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

@vercel
Copy link

vercel bot commented Apr 29, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback May 6, 2025 0:14am

with open(config_path, "w") as updated_config:
config = json.dumps(config)
updated_config.write(config)
def backup_config():
Copy link
Contributor Author

@ChristoGrab Christo Grabowski (ChristoGrab) Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The revert_migration test function was reformatting the local test_config.json file on each run, leading to formatting issues with our current pre-commit hooks. Not a big deal, but could be annoying for local dev work. Switching to a backup/restore of the original file fixed the issue for me.

requests-mock = "^1.11.0"
pytest-lazy-fixture = "^0.6.3"
pytest = "^8.0.0"
pytest = "^7.4.0"
Copy link
Contributor Author

@ChristoGrab Christo Grabowski (ChristoGrab) Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted because pytest 8.0 does not play nice with the pytest-lazy-fixture dependency which is used in unit tests. Guessing there's a decent chance that unit tests will have to be adjusted anyway as we're migrating streams, so maybe we can rip out this dependency soon, but feels out of scope for this PR.

@ChristoGrab
Copy link
Contributor Author

Kicked off a regression test here. Might take a while to finish, this source has some beefy read times:

https://github.com/airbytehq/airbyte/actions/runs/14742276635/job/41382826417

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw from the regression tests that some of the streams have switched to being resumable. What are those streams? Should we disable this to keep the same behavior?

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented Apr 30, 2025

/format-fix

Format-fix job started... Check job output.

✅ Changes applied successfully. (d88bdbf)

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented Apr 30, 2025

I saw from the regression tests that some of the streams have switched to being resumable. What are those streams? Should we disable this to keep the same behavior?

Maxime Carbonneau-Leclerc (@maxi297) Was having a hard time interpreting the test report to figure out which streams were affected, but checking locally I noticed that sites and sitemaps streams had switched from false to true by default with the CDK bump. I've disabled the flag for those streams and will see if the diff still pops up in regression tests.

Test run url: https://github.com/airbytehq/airbyte/actions/runs/14762282210

@maxi297
Copy link
Contributor

There seems to be some mismatches when syncing using a state. Are we able to explain these?

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented May 1, 2025

Maxime Carbonneau-Leclerc (@maxi297) I don't think the record diffs in search_analytics_keyword_page_report are too worrying as the reads are so long and the data could be getting updated. Definitely interesting that search_analytics_keyword_site_report_by_site didn't return any records though, running another round of regression tests just for that stream to make sure it wasn't a fluke.

Link here: https://github.com/airbytehq/airbyte/actions/runs/14781221599/job/41500384114

NOTE: tested locally and didn't see anything suspicious from search_analytics_keyword_site_report_by_site, it hit a 400 error that was handled gracefully and synced records as expected

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented May 1, 2025

There is definitely something up with the search_analytics_keyword_site_report_by_site stream, it's still returning no records in the target version... Everything seems fine when I run locally, I'll keep poking though. What's especially odd is there were no diffs in record count before I flipped the is_resumable flag for sites and sitemaps. I'm not clear on how that could have affected this stream, but will try to verify if it's the cause somehow.

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented May 1, 2025

Maxime Carbonneau-Leclerc (@maxi297) I looked into it and I think this may actually be a case of incremental behavior that was not previously working for search_analytics_keyword_site_report_by_site. Running an incremental read with state set to { "date": "2050-08-28" } returns no records as expected on the updated version, but on master 100 records are still read.

I say this with the confidence of someone who is wrong about these things 95% of the time, but I think it may be worth moving forward with a progressive rollout.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think it's good to move forward with this, I'll approve!

@ChristoGrab
Copy link
Contributor Author

Christo Grabowski (ChristoGrab) commented May 6, 2025

/approve-regression-tests

Record mismatch in search_analytics_keyword_site_report_by_site seems to be related to incremental reads correctly skipping records with cursor values < stream state

Check job output.

✅ Approving regression tests

@ChristoGrab Christo Grabowski (ChristoGrab) merged commit 8b49bf0 into master May 6, 2025
29 checks passed
@ChristoGrab Christo Grabowski (ChristoGrab) deleted the christo/gsc-cdk6 branch May 6, 2025 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants