-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix 69 - Refactor how arguments are added to scripts #102
Conversation
Signed-off-by: Miguel Martínez <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes. Most of my comments are around the same topic:
- There are some parts of the codebase that repeat the same arguments across multiple files. It makes sense to group some of those into the common file for reuse. For other arguments, that are very specific to a use case or notebook, do you have a strong opinion on why they should reside in a common class rather than added directly in the example itself?
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I like this idea going forward, but I think I want to see a couple of changes. Relating to Ayush's review summary, I think that common arguments (like those found in add_distributed_args
) should be grouped together. I think this class also makes more sense as a home for add_distributed_args
and similar functions rather than them being entirely independent.
However, I think if there isn't any reuse of a list of arguments in a script, we shouldn't make a method in ArgumentHelper
for it. Arguments that are hyper-specific to each script should remain only in that script and not be abstracted away. I'm fine abstracting away collections of arguments that are used in 2+ scripts though (like input-data-dir
, input-type
, etc.). So I think I'd like to see ArgumentHelper
shrink a bit for now and return a lot of the niche arguments to their original scripts.
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Hi Ryan, Thank you! Just done! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version is great and much more in line with what I think is the most useful. I found a couple of outdated arguments (that were a lot easier to spot thanks to you) that would be good to fix before merging this in. Once all my changes are addressed and I verify via the CI that everything is ok, we should be good.
examples/k8s/create_dask_cluster.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Leave this file unchanged. ArgumentHelper
isn't being used and the rest of the changes are inconsequential.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be addressed.
The former message was out of date. Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
Improve help wording for output_data_dir argument Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
Update help wording for ouptut-data-dir argument Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
There is a typo in help for argument output-model Co-authored-by: Ryan Wolf <[email protected]> Signed-off-by: Miguel Martínez <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
Signed-off-by: miguelusque <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment of mine is left to be addressed. Other than that, should be good! I'll run the CI now just to double check
examples/k8s/create_dask_cluster.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to be addressed.
Signed-off-by: Miguel Martínez <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks again for this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed it works with the fuzzy dedup cli pipeline
Description
I have noticed that the code used to parse the scripts is repeated in multiple places.
I have refactored the different
parser.add_argument
calls, grouping them in a single file that is consumed by the different scripts.