[Bug]: Python SDK sometimes crashes in streaming jobs running on 2.47.0+ SDK #27330
Comments
We suspect the issue affects only the Python 3.10 SDK. An immediate workaround may be to use Python 3.9 or Python 3.11. We are working on confirming the fix.
We experienced the same issue in a loop for around 5 minutes on Python 3.11 in Dataflow, and then it resolved automatically.
@victorrgez Thank you for the feedback. The attribute error issues should be fixed once we upgrade to the upcoming release of the protobuf library (more details on protobuf issues in #28246). Python 3.10-specific crashes should be fixed with #28355.
Any update on this P1?
We expect it to be resolved in 2.51.0.
There is some evidence that this issue is not fully resolved, so I am reopening it and will look closer to see if I can repro it reliably for further investigation.
We expect this issue to be resolved in 2.53.0.
What happened?
We suspect that an upgrade to `protobuf==4.x.x` in the Beam SDK & worker containers (#24599) introduced a failure mode in Python streaming pipelines, where the Python process sometimes crashes with `AttributeError` messages or segmentation faults, and in some cases the pipeline becomes stuck. We expect this to be resolved in Beam 2.53.0. Batch pipelines should not be affected.
Mitigations:
- Use `apache-beam==2.53.0` or above (once released), OR
- Use `apache-beam==2.46.0` or below, OR
- Install protobuf 3.x in the submission and runtime environments, for example via a `--requirements_file` pipeline option pointing at a file that pins protobuf to a 3.x release, OR
- If you must use protobuf 4.x, switch to the pure-Python implementation of protobuf by setting the `PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python` environment variable in the runtime environment. This may degrade performance, since the pure-Python implementation is less efficient. For example, you can build a custom Beam SDK container from a Dockerfile that sets this variable.

Example errors:
The pipelines usually recover after the process crashes, but the crashes may cause delays or leave the pipeline stuck.
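As a sketch of the `--requirements_file` mitigation above (the exact version pin is an assumption, not taken from this issue; any protobuf 3.x release compatible with your Beam version should work):

```
# requirements.txt (hypothetical): pin protobuf below 4.x
protobuf>=3.20.3,<4
```

The pipeline would then be launched with the `--requirements_file=requirements.txt` pipeline option so the same pin is applied in the runtime environment.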
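The custom-container mitigation above could be sketched as a Dockerfile like the following (the base image tag is an assumption; substitute the Beam SDK image matching your Python and Beam versions):

```dockerfile
# Hypothetical custom Beam SDK container; base image tag is an assumption
FROM apache/beam_python3.10_sdk:2.50.0
# Force the pure-Python protobuf implementation (slower, but avoids the crashes)
ENV PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
```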
Issue Priority
Priority: 1 (data loss / total loss of function)
Issue Components