Support multiple DagProcessors parsing files from different locations. #25935
I'm not sure where the 2.4.0 release stands. If this PR can't make it in, I believe it can wait until 2.5.0.
ashb
left a comment
Rather than a global variable (which is what DagProcessorDirectory is), how about:
Add an attribute to the DagFileProcessorProcess constructor (passed down from the Manager), and add a dag_directory argument to DagBag.sync_to_db, which can then be passed down to DAG.bulk_write_to_db and SerializedDagModel.write_dag.
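A minimal sketch of the shape of this suggestion, using toy stand-ins rather than the real Airflow classes. The names `ToyManager`, `ToyFileProcessor`, `ToyStore`, and `toy_sync_to_db` are hypothetical; the only point is that the directory travels as an explicit argument instead of being read from module-level state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Hypothetical stand-in for the metadata DB: maps dag_id -> owning directory."""
    rows: dict = field(default_factory=dict)


def toy_sync_to_db(dag_ids, store, dag_directory):
    # The directory arrives as an argument, so two processors with
    # different directories never share hidden state.
    for dag_id in dag_ids:
        store.rows[dag_id] = dag_directory


class ToyFileProcessor:
    def __init__(self, dag_directory):
        # Set once by whoever constructs the processor (the manager).
        self.dag_directory = dag_directory

    def process(self, dag_ids, store):
        toy_sync_to_db(dag_ids, store, dag_directory=self.dag_directory)


class ToyManager:
    def __init__(self, dag_directory):
        self.dag_directory = dag_directory

    def make_processor(self):
        # The manager hands its directory down to each processor it creates.
        return ToyFileProcessor(self.dag_directory)


store = ToyStore()
ToyManager("/dags/team_a").make_processor().process(["a1", "a2"], store)
ToyManager("/dags/team_b").make_processor().process(["b1"], store)
```

With a global, the second manager would have overwritten the directory seen by the first; here each written row records the directory of the processor that produced it.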
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
# Only applicable if `[scheduler]standalone_dag_processor` is true.
# Time in seconds after which dags, which were not updated by Dag Processor are deactivated.
dag_stale_not_seen_duration = 600
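One way to read this setting, as a rough sketch and not the PR's actual implementation: a DAG counts as stale when the time since a Dag Processor last updated it exceeds the configured number of seconds. The function `find_stale_dags` and its `last_parsed` mapping are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta, timezone

DAG_STALE_NOT_SEEN_DURATION = 600  # seconds, mirroring the default above


def find_stale_dags(last_parsed, now, max_age_seconds=DAG_STALE_NOT_SEEN_DURATION):
    """Return sorted dag_ids whose last parse is older than max_age_seconds."""
    cutoff = now - timedelta(seconds=max_age_seconds)
    return sorted(dag_id for dag_id, seen in last_parsed.items() if seen < cutoff)


now = datetime(2022, 9, 1, 12, 0, tzinfo=timezone.utc)
last_parsed = {
    "fresh_dag": now - timedelta(seconds=30),   # updated recently, stays active
    "stale_dag": now - timedelta(seconds=601),  # not seen for longer than the limit
}
stale = find_stale_dags(last_parsed, now)
```

Read this way, the value is a maximum age rather than a polling interval, which may be part of why the name was hard to parse.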
dag_stale_not_seen_duration --> any suggestions for a better name? This config name isn't easy to understand.
eh.. naming things...
Let me be wild on that one:
deactivation_time_for_missing_dags_in_standalone_dag_processor_mode
Another suggestion from @ashb was mark_dag_stale_not_seen_in
Is it better?
All of them are awful :)
actually yours @potiuk helped me to understand the meaning of that parameter ;p
Horrible names can also be best :)
Support running multiple standalone DagProcessors, each configured to parse DAGs from a different directory.
Changes:
Usage:
Part of https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-43+DAG+Processor+separation