-
Notifications
You must be signed in to change notification settings - Fork 246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Set param from config #553
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #553 +/- ##
=======================================
Coverage 99.50% 99.51%
=======================================
Files 41 41
Lines 2219 2251 +32
=======================================
+ Hits 2208 2240 +32
Misses 11 11 ☔ View full report in Codecov by Sentry. |
Do we have all the codes in this new branch? I remember we need to change the Schema model in #545 |
Sorry, there were some problems when my local branch was synchronized from upstream changes of the master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change config at runtime is a dangerous thing as we discussed previously under multi-thread context. Although we normally encourage user to use multi-process for bulk process, but thread-safety is still something we need to guarantee given SQLLineageConfig is a global variable.
With that in mind, I'd like to propose this API leveraging Python context manager:
with SQLLineageConfig.set(DEFAULT_SCHEMA="ods", TSQL_NO_SEMICOLON=True):
runner = LineageRunner(sql)
print(runner.source_tables)
Once entering the with block, those config are temporarily set to another value for current thread and are only readable to current thread. And the temporary value is cleaned up after exit with block.
At the heart of the implementation, we should have a data structure like:
{
"thread_1": {"DEFAULT_SCHEMA": "ods", "TSQL_NO_SEMICOLON": True},
"thread_2": {"DEFAULT_SCHEMA": "default"}
}
which takes highest priority > environment variable > default value
redo rebase and clean submit
* fix: misidentify-column-name-as-alias (reata#539) * add LATERAL_COLUMN_ALIAS_REFERENCE in SQLLineageConfig * adjust import order * add test_column_top_level_enable_lateral_ref_with_metadata_from_nested_subquery * unknown * refactor: rebase master and convert LATERAL_COLUMN_ALIAS_REFERENCE to bool type * refactor: use as few condition as possible: SQLLineageConfig.LATERAL_COLUMN_ALIAS_REFERENCE * refactor: rebase master and resolve conflict * refactor: move logic from to_source_columns to end_of_query_cleanup * refactor: rebase master and fix black format * docs: LATERAL_COLUMN_ALIAS_REFERENCE how-to guide * docs: starting version for each config --------- Co-authored-by: reata <[email protected]>
…e test of multi-process scenario.
a21e4e9
to
b0260fc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please stick to our design that user need to explicitly call set method instead of __setattr__:
with SQLLineageConfig.set(DEFAULT_SCHEMA="ods", TSQL_NO_SEMICOLON=True):
runner = LineageRunner(sql)
print(runner.source_tables)
…param_from_config # Conflicts: # sqllineage/config.py # tests/core/test_config.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* fix: Set param from config reata#545 redo rebase and clean submit * fix: misidentify column name as lateral alias (reata#540) * fix: misidentify-column-name-as-alias (reata#539) * add LATERAL_COLUMN_ALIAS_REFERENCE in SQLLineageConfig * adjust import order * add test_column_top_level_enable_lateral_ref_with_metadata_from_nested_subquery * unknown * refactor: rebase master and convert LATERAL_COLUMN_ALIAS_REFERENCE to bool type * refactor: use as few condition as possible: SQLLineageConfig.LATERAL_COLUMN_ALIAS_REFERENCE * refactor: rebase master and resolve conflict * refactor: move logic from to_source_columns to end_of_query_cleanup * refactor: rebase master and fix black format * docs: LATERAL_COLUMN_ALIAS_REFERENCE how-to guide * docs: starting version for each config --------- Co-authored-by: reata <[email protected]> * feat:SQLLineageConfig supports set value and thread safety * fix: Fix mypy error * fix: Fix pytest cov * fix: Fix the scenario of direct assignment without using with. Add the test of multi-process scenario. * fix: add SQLLineageConfigLoader set function * feat: disable setattr for SQLLineageConfig * feat: make SQLLineageConfig context manager non-reentrant * feat: disable set unknown config * feat: access config in parallel * chore: disable A005 for module name builtin conflict * refactor: classmethod to staticmethod --------- Co-authored-by: liuzhou <[email protected]> Co-authored-by: maoxd <[email protected]> Co-authored-by: reata <[email protected]>
Dear ALL:
目前模型的默认模式名是通过环境变量传递的,我认为实现方法不够python。
所以,我实现了另一种解决方法。
修改 sqllineage/config.py 实现配置项的读取和写入, 读取方法保持和已有方法兼容。
修改 sqllineage/core/models.py 的Schema 类,在每次初始化的时候动态读取配置
修改 sqllineage/runner.py的init方法,传入变量,并在 SQLLineageConfig 赋值
请大家提出意见,谢谢
Currently the default schema name of the model is passed through environment variables, and I think the implementation method is not python enough.
So, I implemented another workaround.
Modify sqllineage/config.py to implement reading and writing of configuration items. The reading method remains compatible with the existing method.
Modify the Schema class of sqllineage/core/models.py and dynamically read the configuration every time it is initialized.
Modify the init method of sqllineage/runner.py, pass in variables, and assign values in SQLLineageConfig
Please give your opinions, thank you
重新建立分支并提交
Re-branch and submit
first step to solve #536