Fix DbtVirtualenvBaseOperator to use correct virtualenv Python path #1252
- Log Python binary info before command execution when using virtualenv
- Reorder log statements for better readability and debugging
DAG pickling caused __init__-assigned methods to reference original instance state instead of newly created instances. Switch to properties for dynamic method assignment to ensure correct instance reference. Update tests to cover new behavior.
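The difference between the two approaches can be illustrated with a minimal sketch. It is not the Cosmos code: the class names, the `py_bin` attribute and the paths are hypothetical, and `copy.copy` only stands in for whatever copying Airflow performs on operator instances, which is an assumption here.

```python
import copy


class InitBound:
    """Assigns a bound method to an attribute inside __init__ (hypothetical)."""

    def __init__(self, py_bin):
        self.py_bin = py_bin
        # The bound method captures *this* instance as its __self__.
        self.invoke = self._run

    def _run(self):
        return self.py_bin


class PropertyBound:
    """Resolves the method through a property on every access (hypothetical)."""

    def __init__(self, py_bin):
        self.py_bin = py_bin

    @property
    def invoke(self):
        # Looked up at access time, so it is bound to whichever instance asks.
        return self._run

    def _run(self):
        return self.py_bin


# A shallow copy reuses the attribute values, including the bound method,
# so the copy keeps calling into the original instance.
original = InitBound("/usr/bin/python")
clone = copy.copy(original)
clone.py_bin = "/tmp/venv/bin/python"
stale = clone.invoke()  # "/usr/bin/python": state of the original instance

original_p = PropertyBound("/usr/bin/python")
clone_p = copy.copy(original_p)
clone_p.py_bin = "/tmp/venv/bin/python"
fresh = clone_p.invoke()  # "/tmp/venv/bin/python": state of the copy

print(stale, fresh)
```

With the property, nothing instance-specific is stored at construction time, so a copied instance cannot carry a reference back to the object it was copied from.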
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1252   +/-   ##
=======================================
  Coverage   95.73%   95.73%
=======================================
  Files          67       67
  Lines        3965     3967    +2
=======================================
+ Hits         3796     3798    +2
  Misses        169      169
tatiana
left a comment
Hi @kesompochy, thank you very much for working on this.
When we write code, there is always the risk we'll be introducing bugs, and in this particular case, the reviewer is as accountable as the author. So, please, don't blame yourself.
As you mentioned, I really believe we need to write an integration test that reproduces the original issue, or a scenario close to it: one that fails before this fix and passes with it.
I'll take some time to reproduce the issue and try to give ideas on how we could accomplish this. We'll be releasing this in a patch release of Cosmos, if not 1.7.1, then 1.7.2.
tatiana
left a comment
@kesompochy I've added a test more representative of the change in #1286. I'll merge your PR and the test I created to validate this change.
Cosmos virtualenv operators are using the system dbt instead of the virtualenv dbt. Create a test case that illustrates issue #1246. This test fails with Cosmos 1.7 (and the current main branch) and passes when using PR #1252.
This PR also introduces two refactors:
- Reuse the parent class method where applicable, as opposed to re-writing it completely
- Force the Virtualenv invocation mode to be `SUBPROCESS` since Airflow/Cosmos are not able to import dbt as a library if it is not part of the same Python virtualenv
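The reasoning behind the second refactor can be sketched in plain Python. This is only an illustration under stated assumptions, not how Cosmos implements its invocation modes: `dbt_importable` and `run_in_interpreter` are hypothetical helper names, and the current interpreter stands in for a virtualenv's Python binary.

```python
import importlib.util
import subprocess
import sys


def dbt_importable() -> bool:
    """Return True only if a top-level `dbt` package is visible to *this* interpreter."""
    return importlib.util.find_spec("dbt") is not None


def run_in_interpreter(python_bin: str, code: str) -> str:
    """Run `code` with another Python binary and return its stripped stdout."""
    result = subprocess.run(
        [python_bin, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Importing as a library only sees packages installed next to the running
# process; a package installed in a separate virtualenv is invisible to it.
print("dbt importable here:", dbt_importable())

# A subprocess runs under whichever interpreter is named, so it can use the
# packages of that environment. sys.executable is used as a stand-in path.
answer = run_in_interpreter(sys.executable, "print(1 + 1)")
print(answer)
```

A package installed only in the virtualenv can be reached by launching that environment's interpreter, but not by importing it into the Airflow process, which is why a subprocess-based mode is the one that fits the virtualenv operators.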
Bug fixes
* Fix ``DbtVirtualenvBaseOperator`` to use correct virtualenv Python path by @kesompochy in #1252
* Fix displaying dbt docs as menu item in Astro by @tatiana in #1280
* Fix: Replace login by user for clickhouse profile by @petershenri in #1255
Enhancements
* Improve dbt Docs Hosting Debugging -- Update dbt_docs_not_set_up.html by @johnmcochran in #1250
* Minor refactor on VirtualenvOperators & add test for PR by @tatiana in #1253
Docs
* Add Welcome Section and "What Is Cosmos" Blurb to Home Page by @cmarteepants and @yanmastin-astro in #1251
* Update the URL for sample dbt docs hosted in Astronomer S3 bucket by @pankajkoti in #1283
* Add a dedicated scarf tracking pixel to readme by @cmarteepants in #1256
Others
* Update ``CODEOWNERS`` to track all files by @pankajkoti in #1284
* Fix release after the ``raw`` rst directive was disabled in PyPI by @tatiana in #1282
* Update issue template ``bug.yml`` - cosmos version update in the dropdown by @pankajkoti in #1275
* Pre-commit hook updates in #1285, #1274, #1254, #1244
@tatiana, Thank you for reviewing and for adding such a comprehensive test case. Using |
Description
This PR addresses an issue where the `DbtVirtualenvBaseOperator` was executing dbt commands using the system-wide Python path instead of the virtualenv path. The root cause was that the `self` reference in the `run_subprocess` method was bound to a different instance than the one created during initialization, likely due to Airflow's DAG pickling mechanism.
To resolve this, we've refactored the `invoke_dbt` and `handle_exception` methods to be properties. This ensures that they dynamically reference the correct method of the current instance at runtime, rather than being bound to a potentially stale instance from initialization.
Related Issue(s)
fix #1246
This may be related to issue #958 in version 1.5.0
Breaking Change?
No
Checklist
Additional Notes
I acknowledge that ideally, a test should be added to reproduce the original issue and verify the fix. However, I found it challenging to create an appropriate test, especially considering that this might require an integration test with Airflow to properly simulate Airflow behavior. If there are suggestions for how to effectively test this scenario, I would greatly appreciate the guidance.
I sincerely apologize for introducing this bug in the first place with PR #1200. I understand this has caused inconvenience, and I'm grateful for the opportunity to fix it. I kindly request a thorough review of these changes to ensure we've fully addressed the issue without introducing new problems.