Error in the Stackoverflow Tokenizer example #301

Open
WilliamYi96 opened this issue May 6, 2023 · 1 comment

@WilliamYi96

TensorFlow version: 2.5.3
fedjax version: 0.0.16
jax version: 0.4.8

When I follow the docs (https://fedjax.readthedocs.io/en/latest/fedjax.datasets.html#fedjax.datasets.stackoverflow.load_data) to process the Stackoverflow dataset with the following code:

from fedjax.datasets import stackoverflow
# Load partially preprocessed splits.
train, held_out, test = stackoverflow.load_data(cache_dir='../data')
# Apply tokenizer during batching.
tokenizer = stackoverflow.StackoverflowTokenizer()
train_max_length, eval_max_length = 20, 30
train_for_train = train.preprocess_batch(
    tokenizer.as_preprocess_batch(train_max_length))
train_for_eval = train.preprocess_batch(
    tokenizer.as_preprocess_batch(eval_max_length))

It fails with the following error:

2023-05-06 23:46:33.460149: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Failed precondition: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    tokenizer = stackoverflow.StackoverflowTokenizer()
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/fedjax/datasets/stackoverflow.py", line 185, in __init__
    self._table = tf.lookup.StaticVocabularyTable(
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/ops/lookup_ops.py", line 1255, in __init__
    raise TypeError("Invalid key dtype, expected one of %s, but got %s." %
TypeError: Invalid key dtype, expected one of (tf.int64, tf.string), but got <dtype: 'float32'>.
Exception ignored in: <function CapturableResource.__del__ at 0x2b2156f4c040>
Traceback (most recent call last):
  File "/home/yik/anaconda2/envs/fl/lib/python3.8/site-packages/tensorflow/python/training/tracking/tracking.py", line 269, in __del__
    with self._destruction_context():
AttributeError: 'StaticVocabularyTable' object has no attribute '_destruction_context'

Could you please help fix this?

@kho
Collaborator

kho commented May 19, 2023

Sorry for the late response. Somehow the email notification slipped through all team members' inboxes.

I cannot reproduce the problem on TensorFlow 2.5.3 installed from pip. Could you tell us how you installed the packages?
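
A short script can collect the exact installed versions to paste into a reply. This is only a sketch, not part of fedjax. The helper names `package_versions` and `environment_report` are made up for illustration, and the package list is an assumption about what is relevant here. It uses only the standard library's `importlib.metadata`, so it does not import TensorFlow or JAX.

```python
import importlib.metadata as metadata
import platform
import sys


def package_versions(names):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to None.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def environment_report(names=("tensorflow", "fedjax", "jax", "jaxlib", "numpy")):
    """Build a plain-text report of the interpreter and package versions."""
    lines = [f"python: {platform.python_version()} ({sys.executable})"]
    for name, version in package_versions(names).items():
        shown = version if version is not None else "not installed"
        lines.append(f"{name}: {shown}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

The report shows which versions are present in the active environment but not how they were installed. The output of `pip list` or `conda list` from the same environment would help with that part.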
