Skip to content

dgoldenberg-audiomack/tf_issue_237

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 

Repository files navigation

tf_issue_237

This repository contains all the code necessary to reproduce the following issue: https://stackoverflow.com/questions/66389284/what-does-this-error-mean-and-how-to-troubleshoot-it-valueerror-shape-must-b

Pre-requisites:

  • python 3, e.g. I'm using Python 3.7.3
  • Spark 3.0.1
  • tensorflow==2.4.0
  • tensorflow_recommenders==v0.4.0
  • tensorflow-io-nightly==0.17.0.dev20210208174016
  • Note on TF IO:
    • I started with tensorflow_io==0.17.0, that yielded: tensorflow/io#1254
    • I then ported to tensorflow-io-nightly which contains the fix for that issue
    • Since then, however, I've come across this: tensorflow/io#1313
    • Per that ticket, it was recommended to me to use tensorflow-io-nightly=<earlier than 2021/02/10>.
    • Therefore, I chose tensorflow-io-nightly==0.17.0.dev20210208174016

To execute code to reproduce the issue:

$ python recsys_tf_237/recsystf_driver.py

The code:

  • recsystf_driver.py - the Spark driver to execute
  • recsystf_model.py - the code exercising TF recommenders

The input data:

  • input-data/events - contains Parquet with event (interactions) data (user id's with item id's)
  • input-data/items - contains CSV data with item ID's
  • input-data/users - contains CSV data with user ID's

The full output with the stack trace I'm seeing:

21/02/26 20:07:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).

>> Running the TFRS based recommender...
>> Loading TF datasets...
>> Loading the ITEMS dataset from ./input-data/items...
2021-02-26 20:07:13.702327: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-26 20:07:13.702606: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>> ITEMS dataset: loaded.
>> Loading the EVENTS dataset from ./input-data/events...
>> Events filepaths: ['./input-data/events/events-001.snappy.parquet']
>> Loading ./input-data/events/events-001.snappy.parquet for events...
2021-02-26 20:07:13.780001: I tensorflow_io/core/kernels/cpu_check.cc:128] Your CPU supports instructions that this TensorFlow IO binary was not compiled to use: AVX2 FMA
>> EVENTS dataset: loaded
>> Loading TF datasets: done.
>> Preparing data...
>> Data preparation: done.
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
>> Number of devices: 1
>> Training the model...
2021-02-26 20:07:27.259021: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/3
WARNING:tensorflow:Model was constructed with shape (None, 1000) for input KerasTensor(type_spec=TensorSpec(shape=(None, 1000), dtype=tf.string, name='string_lookup_1_input'), name='string_lookup_1_input', description="created by layer 'string_lookup_1_input'"), but it was called on an input with incompatible shape (None,).
Traceback (most recent call last):
  File "recsys_tf_237/recsystf_driver.py", line 54, in <module>
    main(sys.argv)
  File "recsys_tf_237/recsystf_driver.py", line 38, in main
    model_maker.train_and_evaluate(model, NUM_TRAIN_EPOCHS)
  File "/Users/dgoldenberg/work/temp2/tf_issue_237/recsys_tf_237/recsystf_model.py", line 179, in train_and_evaluate
    model.fit(model.cached_train_event_ds, epochs=num_epochs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
    *args, **kwds))
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/tasks/retrieval.py:157 call  *
        update_op = self._factorized_metrics.update_state(query_embeddings,
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/metrics/factorized_top_k.py:83 update_state  *
        top_k_predictions, _ = self._candidates(query_embeddings, k=self._k)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/layers/factorized_top_k.py:224 top_k  *
        joined_scores = tf.concat([state_scores, x_scores], axis=1)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py:201 wrapper  **
        return target(*args, **kwargs)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:1677 concat
        return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:1208 concat_v2
        "ConcatV2", values=values, axis=axis, name=name)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:592 _create_op_internal
        compute_device)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3536 _create_op_internal
        op_def=op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2016 __init__
        control_input_ops, op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1856 _create_c_op
        raise ValueError(str(e))

    ValueError: Shape must be rank 2 but is rank 3 for '{{node concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](args_0, args_2, concat/axis)' with input shapes: [?,0], [?,?,?], [].

Without using the tf.data.experimental.AutoShardPolicy.OFF, the error is similar but https://stackoverflow.com/questions/65322700/tensorflow-keras-consider-either-turning-off-auto-sharding-or-switching-the-a recommends disabling auto-sharding.

21/02/27 01:24:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
>> Loading TF datasets...
>> Loading the ITEMS dataset from ./input-data/items...
2021-02-27 01:24:06.223795: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-02-27 01:24:06.224062: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
>> ITEMS dataset: loaded.
>> Loading the EVENTS dataset from ./input-data/events...
>> Events filepaths: ['./input-data/events/events.snappy.parquet']
>> Loading ./input-data/events/events.snappy.parquet for events...
2021-02-27 01:24:06.287886: I tensorflow_io/core/kernels/cpu_check.cc:128] Your CPU supports instructions that this TensorFlow IO binary was not compiled to use: AVX2 FMA
>> EVENTS dataset: loaded
>> Loading TF datasets: done.
>> Preparing data...
>> Data preparation: done.
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
>> Training the model...
2021-02-27 01:24:19.974113: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:656] In AUTO-mode, and switching to DATA-based sharding, instead of FILE-based sharding as we cannot find appropriate reader dataset op(s) to shard. Error: Did not find a shardable source, walked to a node which is not a dataset: name: "UnbatchDataset/_17"
op: "UnbatchDataset"
input: "MapDataset/_16"
attr {
  key: "output_shapes"
  value {
    list {
      shape {
      }
    }
  }
}
attr {
  key: "output_types"
  value {
    list {
      type: DT_STRING
    }
  }
}
. Consider either turning off auto-sharding or switching the auto_shard_policy to DATA to shard this dataset. You can do this by creating a new `tf.data.Options()` object then setting `options.experimental_distribute.auto_shard_policy = AutoShardPolicy.DATA` before applying the options object to the dataset via `dataset.with_options(options)`.
2021-02-27 01:24:19.995799: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/3
WARNING:tensorflow:Model was constructed with shape (None, 1000) for input KerasTensor(type_spec=TensorSpec(shape=(None, 1000), dtype=tf.string, name='string_lookup_1_input'), name='string_lookup_1_input', description="created by layer 'string_lookup_1_input'"), but it was called on an input with incompatible shape (None,).
Traceback (most recent call last):
  File "recsys_tf_237/recsystf_driver.py", line 37, in <module>
    main(sys.argv)
  File "recsys_tf_237/recsystf_driver.py", line 29, in main
    model_maker.train_and_evaluate(model, NUM_TRAIN_EPOCHS)
  File "/Users/dgoldenberg/work/audiomack-code/code/tf_issue_237/recsys_tf_237/recsystf_model.py", line 111, in train_and_evaluate
    model.fit(model.cached_train_event_ds, epochs=num_epochs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
    *args, **kwds))
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/tasks/retrieval.py:157 call  *
        update_op = self._factorized_metrics.update_state(query_embeddings,
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/metrics/factorized_top_k.py:83 update_state  *
        top_k_predictions, _ = self._candidates(query_embeddings, k=self._k)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow_recommenders/layers/factorized_top_k.py:224 top_k  *
        joined_scores = tf.concat([state_scores, x_scores], axis=1)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/util/dispatch.py:201 wrapper  **
        return target(*args, **kwargs)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/ops/array_ops.py:1677 concat
        return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:1208 concat_v2
        "ConcatV2", values=values, axis=axis, name=name)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:592 _create_op_internal
        compute_device)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3536 _create_op_internal
        op_def=op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2016 __init__
        control_input_ops, op_def)
    /Users/dgoldenberg/.pyenv/versions/3.7.3/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1856 _create_c_op
        raise ValueError(str(e))

    ValueError: Shape must be rank 2 but is rank 3 for '{{node concat}} = ConcatV2[N=2, T=DT_FLOAT, Tidx=DT_INT32](args_0, args_2, concat/axis)' with input shapes: [?,0], [?,?,?], [].

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages