-
Notifications
You must be signed in to change notification settings - Fork 16.4k
Refactor serialization #28067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor serialization #28067
Conversation
airflow/utils/module_loading.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this is needed for things like below:
In [1]: import numpy
In [2]: numpy.int64(2)
Out[2]: 2
In [3]: type(_2)
Out[3]: numpy.int64
In [4]: type(_2).__qualname__
Out[4]: 'int64'
In [5]: type(_2).__module__
Out[5]: 'numpy'
In [6]: _2.__qualname__
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 _2.__qualname__
AttributeError: 'numpy.int64' object has no attribute '__qualname__'
In [7]: _2.__module__
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [7], in <cell line: 1>()
----> 1 _2.__module__
AttributeError: 'numpy.int64' object has no attribute '__module__'
airflow/utils/serialization.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we try to get it from version too as a fallback? as that was shipped with 2.5?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| VERSION: getattr(cls, "__version__", DEFAULT_VERSION), | |
| VERSION: getattr(cls, "__version__", getattr(cls, "version", DEFAULT_VERSION)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it was shipped, but nowhere implemented, so only custom classes after that could use this. Some of the providers collide with version (e.g. ggoogle) so we are obtaining the wrong version then. So I'd prefer to take the small hit and this is PEP 8/480 compatible.
airflow/utils/serialization.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is only really used by SerializedDAG, I think it’s more productive to simply have deserialize(...) -> Any and add a separate deserialize_dag(...) -> SerializedDAG that simply calls deserialize internally and provide a better return type hint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not only used by SerializedDAG (why do you think so?) it will be used by XCom too (which is even the first implementation). deserialize_dag will be removed as it is non-standard.
6b92ce0 to
97b0f8c
Compare
Seralization and deserialization (derser) of objects that were not under the control of Airflow of the dag author was not possible. This implementation allows for extension of serialization of arbitrary objects by putting them in the namespace of airflow. Selection of serializer happens in order of registered serializer, custom serializer, attr/dataclass annottated object.
613bcc8 to
326f227
Compare
Co-authored-by: Kaxil Naik <kaxilnaik@gmail.com>
This refactors serialization so it is now modular and allows for extending. It implements versioning and allows namespace packages to be added to allow custom serializers for third-party objects.
This should replace all other serializers (eventually, only DAG remains).
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rstor{issue_number}.significant.rst, in newsfragments.