-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/md5 usedforsecurity #17866
Feature/md5 usedforsecurity #17866
Conversation
maybe @alangenfeld or @smackesey or @gibsondan might be good reviewers for this? |
correct these are not security based hashes so i think this PR is fine once the inlines are addressed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me. It looks like buildkite is failing on ruff
you'll need to run make ruff
and push up the changes
oof, looks like the |
@alangenfeld Attempting to try/catch it |
dismissing due to substantive changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be preferable to capture this in a shared utility function ie non_secure_md5
and use that at each callsite.
The implementation oft this function could instead check the python version to decide whether to use the usedforsecurity
kwarg instead of resorting to try catch as well.
@alangenfeld , perhaps I'm unaware of how you implement utility functions but 3 of the instances of the usage of these md5 calls are in complete separate packages (i.e., dagster core, dagster-dbt, and dagster-k8s). Is there a way to share a function among them all? Or do I need to add a new function, as you described, to each package? |
|
nice, i think this is close. Ran the test suite https://buildkite.com/dagster/unit-tests/builds/245
|
@alangenfeld Thanks for the feedback. I think I've incorporated most of that feedback. Happy to take further feedback. |
unblocked builds again - this should be good to go once the build is green |
Looks like |
will just have to use For the other callsite, I think you could either
|
Alright, made those changes. Should be good to go 🤞 |
Pyright again doesn't like something, but this time it doesn't appear to be related to the things I have changed, except that it's in the same file as one of my changes ... |
Hmm thats frustrating. Why don't you rebase/merge to resolve the conflict and if the weird error is still present i will just merge it and deal with it. |
done. |
thanks for the PR! |
## Summary & Motivation For FIPS enabled systems the MD5 function is disabled in `openssl`. Since Dagster is using `hashlib.md5` in various locations (`dagster`, `dagster-dbt`, and `dagster-k8s`), on a FIPS enabled environment the UI will deliver the following error when trying to load the code location: ``` ValueError: [digital envelope routines: EVP_DigestInit_ex] disabled for FIPS File "/opt/dagster/app/venv/lib/python3.11/site-packages/dagster/_grpc/server.py", line 609, in _get_serialized_external_repository_data external_repository_data_from_def( File "/opt/dagster/app/venv/lib/python3.11/site-packages/dagster/_core/host_representation/external_data.py", line 1341, in external_repository_data_from_def asset_graph = external_asset_graph_from_defs( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/opt/dagster/app/venv/lib/python3.11/site-packages/dagster/_core/host_representation/external_data.py", line 1531, in external_asset_graph_from_defs atomic_execution_unit_id = assets_def.unique_id ^^^^^^^^^^^^^^^^^^^^ File "/opt/dagster/app/venv/lib/python3.11/site-packages/dagster/_core/definitions/assets.py", line 1254, in unique_id return hashlib.md5((json.dumps(sorted(self.keys))).encode("utf-8")).hexdigest() ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` A web search [indicates](s3tools/s3cmd#1005) that flagging such `hashlib.md5` uses with the `usedforsecurity=False` parameter will resolve this error. As far as I can ascertain, each of the modified usages are indeed NOT used for the security of the md5 algorithm but instead used to determine the uniqueness of the item(s) being hashed. If this is not the case, my PR will need to be corrected. ## How I Tested These Changes I have deployed these changes on my own companies FIPS-enabled, k8s-based systems and seen the error resolved.
Summary & Motivation
For FIPS enabled systems the MD5 function is disabled in
openssl
. Since Dagster is usinghashlib.md5
in various locations (dagster
,dagster-dbt
, anddagster-k8s
), on a FIPS enabled environment the UI will deliver the following error when trying to load the code location:A web search indicates that flagging such
hashlib.md5
uses with theusedforsecurity=False
parameter will resolve this error. As far as I can ascertain, each of the modified usages are indeed NOT used for the security of the md5 algorithm but instead used to determine the uniqueness of the item(s) being hashed. If this is not the case, my PR will need to be corrected.How I Tested These Changes
I have deployed these changes on my own companies FIPS-enabled, k8s-based systems and seen the error resolved.