TensorBoard's generated pb2.py files are incompatible with protobuf 4.21.0 #5703

Closed
czxttkl opened this issue May 12, 2022 · 12 comments

@czxttkl

czxttkl commented May 12, 2022

Edited by @nfelt for context: The original title for this issue was "AttributeError: module 'google._upb._message' has no attribute 'Message'", because that was the error message with protobuf 4.21.0rc1. The final release, protobuf 4.21.0, has a different error message: "TypeError: Descriptors cannot not be created directly."

Current status:


Our open source tests have recently encountered the following error which comes from tensorboard (or maybe protobuf):

...
    from torch.utils.tensorboard import SummaryWriter
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/torch/utils/tensorboard/__init__.py:10: in <module>
    from .writer import FileWriter, SummaryWriter  # noqa: F401
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/torch/utils/tensorboard/writer.py:9: in <module>
    from tensorboard.compat.proto.event_pb2 import SessionLog
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/tensorboard/compat/proto/event_pb2.py:17: in <module>
    from tensorboard.compat.proto import summary_pb2 as tensorboard_dot_compat_dot_proto_dot_summary__pb2
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/tensorboard/compat/proto/summary_pb2.py:17: in <module>
    from tensorboard.compat.proto import tensor_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__pb2
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/tensorboard/compat/proto/tensor_pb2.py:16: in <module>
    from tensorboard.compat.proto import resource_handle_pb2 as tensorboard_dot_compat_dot_proto_dot_resource__handle__pb2
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/tensorboard/compat/proto/resource_handle_pb2.py:16: in <module>
    from tensorboard.compat.proto import tensor_shape_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__shape__pb2
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/tensorboard/compat/proto/tensor_shape_pb2.py:36: in <module>
    _descriptor.FieldDescriptor(
.tox/circleci_gym_gpu_unittest/lib/python3.8/site-packages/google/protobuf/descriptor.py:560: in __new__
    _message.Message._CheckCalledFromGeneratedFile()
E   AttributeError: module 'google._upb._message' has no attribute 'Message'

Our pip install package list is as follows:

circleci_gym_gpu_unittest installed: absl-py==1.0.0,aiohttp==4.0.0a1,arrow==1.2.2,async-timeout==3.0.1,atari-py==0.2.9,attrs==21.4.0,box2d-py==2.3.8,cachetools==5.0.0,certifi==2021.10.8,chardet==3.0.4,charset-normalizer==2.0.12,click==8.1.3,cloudpickle==1.2.2,cmake==3.22.4,coverage==6.3.2,Cython==3.0.0a10,dill==0.3.4,diskcache==5.4.0,distro==1.7.0,docker==5.0.3,docstring-parser==0.8.1,execnet==1.9.0,filelock==3.6.0,findspark==2.0.1,fsspec==2022.3.0,future==0.18.2,gin-config==0.5.0,google-auth==2.6.6,google-auth-oauthlib==0.4.6,grpcio==1.46.1,gym==0.17.2,gym-minigrid==1.0.3,hypothesis==6.46.3,idna==3.3,importlib-metadata==4.11.3,iniconfig==1.1.1,iopath==0.1.9,Jinja2==3.1.2,joblib==1.1.0,linecache2==1.0.0,Markdown==3.3.7,MarkupSafe==2.1.1,multidict==4.7.6,mypy-extensions==0.4.3,ninja==1.10.2.3,numpy==1.22.3,oauthlib==3.2.0,opencv-python==4.5.5.64,packaging==21.3,pandas==1.4.2,parameterized==0.8.1,petastorm==0.11.4,Pillow==9.1.0,pluggy==1.0.0,portalocker==2.4.0,protobuf==4.21.0rc1,psutil==5.9.0,py==1.11.0,py4j==0.10.9,pyarrow==8.0.0,pyasn1==0.4.8,pyasn1-modules==0.2.8,pydantic==1.6.2,pyDeprecate==0.3.2,pyglet==1.5.0,pyparsing==3.0.9,pyre-extensions==0.0.27,pyspark==3.1.1,pytest==7.1.2,pytest-cov==3.0.0,pytest-forked==1.4.0,pytest-xdist==2.5.0,python-dateutil==2.8.2,pytorch-lightning @ git+https://github.com/PyTorchLightning/pytorch-lightning@9b011606f354ab6afa4135cc8bfe1339a06b3aeb,pytz==2022.1,PyYAML==6.0,pyzmq==23.0.0b2,reagent @ 
file:///home/circleci/project/.tox/.tmp/package/1/reagent-0.1.zip,recsim-no-tf==0.2.3,requests==2.27.1,requests-oauthlib==1.3.1,rsa==4.8,ruamel.yaml==0.17.21,ruamel.yaml.clib==0.2.6,scikit-build==0.14.1,scikit-learn==1.1.0rc1,scipy==1.8.0,six==1.16.0,sortedcontainers==2.4.0,spark-testing-base==0.10.0,tabulate==0.8.9,tensorboard==2.9.0,tensorboard-data-server==0.6.1,tensorboard-plugin-wit==1.8.1,threadpoolctl==3.1.0,tinydb==4.7.0,tomli==2.0.1,torch==1.11.0+cu113,torchmetrics==0.8.2,torchrec==0.1.0,torchx-nightly==2022.5.11,tqdm==4.64.0,traceback2==1.4.0,typing==3.7.4.3,typing-inspect==0.7.1,typing_extensions==4.2.0,unittest2==1.1.0,urllib3==1.26.9,websocket-client==1.3.2,Werkzeug==2.1.2,yarl==1.7.2,zipp==3.8.0

Can anyone help us investigate whether this is a known bug? How do we unblock ourselves? Thanks!

@bileschi
Collaborator

Can you verify whether this works with a previous version of TensorBoard? It looks like you are using tensorboard==2.9.0. Can you try tensorboard==2.8.0?

@czxttkl
Author

czxttkl commented May 12, 2022

> Can you verify whether this works with a previous version of TensorBoard? It looks like you are using tensorboard==2.9.0. Can you try tensorboard==2.8.0?

So far I have only been able to verify that the issue goes away if I use tensorboard==2.9.0 and force protobuf to 3.20.1 (by default it resolved to 4.21.0rc1). According to https://developers.google.com/protocol-buffers/docs/news/2022-05-06, protobuf has just jumped from 3.20.1 to 4.21.0.
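For anyone checking which versions their own environment resolved to, here is a minimal sketch using only the standard library (`importlib.metadata`, so Python 3.8+). The helper name `installed_versions` is made up for illustration; the package names are the ones discussed in this thread.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names taken from the discussion above.
    report = installed_versions(["protobuf", "tensorboard", "grpcio-tools"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

A protobuf version starting with `4.` alongside tensorboard 2.8.0 or 2.9.0 matches the combination reported here.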

@parthea

parthea commented May 12, 2022

The issue may be resolved if you install grpcio-tools>=1.44.0. See protocolbuffers/protobuf#9954

@mrgeorge

I saw the same error. It appeared with both tensorboard 2.8.0 and 2.9.0 when protobuf 4.21.0rc1 was installed, and went away when I forced protobuf to 3.20.1.

@czxttkl
Author

czxttkl commented May 13, 2022

> The issue may be resolved if you install grpcio-tools>=1.44.0. See protocolbuffers/protobuf#9954

This works.

@nfelt
Contributor

nfelt commented May 13, 2022

The more recent updates to protocolbuffers/protobuf#9954 make it pretty clear that grpcio-tools itself does not fix the issue; installing it probably just forced a protobuf downgrade.

The solution is probably that TensorBoard needs to either regenerate our _pb2.py files using protoc 3.19.0+ (unfortunately not that easy; Googlers, see b/219030239) or we need to pin our protobuf runtime dep to <4.0.0.
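To illustrate the second option, here is a rough sketch of what a `<4.0.0` pin admits. It compares only the leading numeric release of each version string and is not a full PEP 440 implementation; the helper names `release_tuple` and `allowed_by_pin` are hypothetical.

```python
import re


def release_tuple(version):
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def allowed_by_pin(version, upper_bound="4.0.0"):
    """True if `version` sorts strictly below `upper_bound` (simplified comparison)."""
    return release_tuple(version) < release_tuple(upper_bound)


# Versions mentioned in this thread.
for candidate in ["3.20.1", "4.21.0rc1", "4.21.0"]:
    print(candidate, "allowed" if allowed_by_pin(candidate) else "excluded")
```

In a real project the pin would live in the package's dependency metadata and be enforced by pip; the function above only shows which of the versions from this thread fall on each side of the bound.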

@nfelt
Contributor

nfelt commented May 17, 2022

I've escalated to the protobuf folks at protocolbuffers/protobuf#9954 (comment).

I also opened up a larger tracking issue for updating to 3.19.0+ in #5708.

@nfelt changed the title from "AttributeError: module 'google._upb._message' has no attribute 'Message'" to "incompatibility with protobuf 4.21.0 (TypeError: Descriptors cannot not be created directly)" on Jun 2, 2022
@nfelt changed the title from "incompatibility with protobuf 4.21.0 (TypeError: Descriptors cannot not be created directly)" to "TensorBoard's generated pb2.py files are incompatible with protobuf 4.21.0" on Jun 2, 2022
@nfelt
Contributor

nfelt commented Jun 2, 2022

This has a workaround in #5726 that forces the protobuf dep to be <3.20.0.

#5708 tracks a longer-term fix.

@nfelt
Contributor

nfelt commented Jun 2, 2022

Originally posted by @saitcakmak in #5708 (comment)


As of the protobuf 4.21.1 release, we're getting import errors when importing certain tensorboard components.

Repro:

conda create -n test_tensorboard python=3.7
conda activate test_tensorboard
pip install tensorboard
python
from tensorboard.backend.event_processing import plugin_event_multiplexer as event_multiplexer

Trace:

>>> from tensorboard.backend.event_processing import plugin_event_multiplexer as event_multiplexer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_multiplexer.py", line 24, in <module>
    from tensorboard.backend.event_processing import (
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/plugin_event_accumulator.py", line 23, in <module>
    from tensorboard.backend.event_processing import event_file_loader
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/backend/event_processing/event_file_loader.py", line 20, in <module>
    from tensorboard import data_compat
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/data_compat.py", line 20, in <module>
    from tensorboard.compat.proto import event_pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/event_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import summary_pb2 as tensorboard_dot_compat_dot_proto_dot_summary__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/summary_pb2.py", line 17, in <module>
    from tensorboard.compat.proto import tensor_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/tensor_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import resource_handle_pb2 as tensorboard_dot_compat_dot_proto_dot_resource__handle__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/resource_handle_pb2.py", line 16, in <module>
    from tensorboard.compat.proto import tensor_shape_pb2 as tensorboard_dot_compat_dot_proto_dot_tensor__shape__pb2
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/tensorboard/compat/proto/tensor_shape_pb2.py", line 42, in <module>
    serialized_options=None, file=DESCRIPTOR),
  File "/Users/saitcakmak/opt/anaconda3/envs/test_tensorboard/lib/python3.7/site-packages/google/protobuf/descriptor.py", line 560, in __new__
    _message.Message._CheckCalledFromGeneratedFile()
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
>>> 
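Workaround 2 from the error message can also be applied from Python rather than from the shell. This is a minimal sketch: the environment variable name comes from the message above, the helper name `force_pure_python_protobuf` is made up, and it assumes (as far as I know) that protobuf chooses its backend when `google.protobuf` is first imported, so the variable has to be set before that. As the message warns, pure-Python parsing is much slower.

```python
import os
import sys

# Variable name taken from the protobuf error message quoted above.
ENV_VAR = "PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"


def force_pure_python_protobuf():
    """Request the pure-Python protobuf backend for this process.

    Returns True if google.protobuf had not been imported yet when the
    variable was set, and False if the setting may have come too late.
    """
    applied_in_time = "google.protobuf" not in sys.modules
    os.environ[ENV_VAR] = "python"
    return applied_in_time


if force_pure_python_protobuf():
    # Only after this point import anything that loads the generated modules,
    # for example: from tensorboard.compat.proto import event_pb2
    pass
else:
    print("google.protobuf was already imported; set the variable earlier.")
```

Exporting the variable in the shell before starting Python has the same effect and avoids the import-order concern.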

nfelt added a commit that referenced this issue Jan 19, 2023
Fixes #5708 and #5703.

This updates our protobuf dependency to 3.19.6 in an attempt to address
#5708, and provide a cleaner solution to #5703.

The choice of 3.19.6 is meant to satisfy two competing constraints:

- Current Python protobuf runtimes (the 4.x series) only support
generated code from protoc versions 3.19.0+, as discussed in
https://protobuf.dev/news/2022-05-06/. As a result, prior to this
change, TensorBoard's pip package had to force its pip package
dependency to `protobuf < 4` to avoid the errors seen in #5703. This PR
lifts that restriction.

- Current TensorFlow is still stuck on protobuf 3.x, the same as we have
been, and as a result pins its pip package dependency using `protobuf <
3.20` (this could presumably be relaxed to `< 4` but that would require
new TF releases). As a result, we must support at least one protobuf
runtime version that also works with TF's constraints.

Our previous attempt at this upgrade (to ~3.18 or so) caused test
failures for Keras (which depends on TB, via TF, for the summary API
code), apparently due to a protobuf runtime that was too old for our
generated code. At the time, this was puzzling because they were
pip-installing a protobuf runtime version that should have been recent
enough - but I suspect now that this was a red herring, and bazel test
was actually getting the protobuf runtime from the protobuf build
dependency, not from the installed Python packages. If we see this
failure mode again, we'll have to get Keras to update the protobuf
Python runtime available in bazel tests.

Lastly, this upgrade lets us clean up some additional issues we had to
work around:

- We can also upgrade gRPC now, to 1.48.2. I selected this version since
it appears to be the most recent version prior to gRPC adopting protobuf
4.x (see
grpc/grpc@41ec08c)
- We can revert the backported fixes to protobuf and grpc from
#5793, since the upgraded
dependencies don't require patching
- We can back out rules_apple reintroduction from
#5561 that we only needed
for gRPC
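
The two constraints in the commit message leave a narrow window of protobuf runtimes that work with TensorFlow as well. The sketch below assumes the runtime should be no older than the protoc release used to generate the code (3.19.6) and uses TensorFlow's `protobuf < 3.20` pin as the upper bound. The lower bound is an assumption on my part, the helper names are hypothetical, and the candidate list is only illustrative.

```python
def parse(version):
    """Parse a plain dotted version such as '3.19.6' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def in_shared_window(version, lower="3.19.6", upper="3.20"):
    """True if lower <= version < upper.

    lower: assumed minimum runtime for code generated by protoc 3.19.6.
    upper: TensorFlow's `protobuf < 3.20` pin described in the commit message.
    """
    return parse(lower) <= parse(version) < parse(upper)


# Illustrative candidates only, not an exhaustive list of releases.
candidates = ["3.18.3", "3.19.6", "3.20.3", "4.21.0"]
usable_with_tf = [v for v in candidates if in_shared_window(v)]
print(usable_with_tf)
```

This only models the overlap with TensorFlow's pin. Per the commit message, TensorBoard on its own no longer restricts protobuf to `< 4` after this change.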
@nfelt
Contributor

nfelt commented Feb 10, 2023

This should be resolved by #6147 which has been released in TensorBoard 2.12.

If you are still seeing this error, please update to TensorBoard 2.12.

@zhanwenchen

I had to do this to solve the problem: pip install tensorboard==2.12.0 --user

yatbear pushed a commit to yatbear/tensorboard that referenced this issue Mar 27, 2023
dna2github pushed a commit to dna2fork/tensorboard that referenced this issue May 1, 2023
@yerrick

yerrick commented May 1, 2023

pip install -U protobuf==3.20.3
Hope this helps.

7 participants