I am receiving the following error:

    Expected q_network to emit a floating point tensor with inner dims (464,); but saw network output spec: TensorSpec(shape=(6, 4, 464), dtype=tf.float32, name=None)
I am building a custom environment for DqnAgent with an observation shape of (6, 4, 4). The action is scalar (I would have liked a (2,), but apparently that's not possible at the moment). I am following this tutorial as closely as I can for my use case.
I was able to successfully validate the environment and run it with a fixed policy, as per the tutorial, so the environment itself is in good shape. I then jumped over to this tutorial to add the agent and copied and pasted these two blocks of code directly:
import tensorflow as tf
from tf_agents.networks import sequential
from tf_agents.specs import tensor_spec

fc_layer_params = (100, 50)
action_tensor_spec = tensor_spec.from_spec(env.action_spec())
num_actions = action_tensor_spec.maximum - action_tensor_spec.minimum + 1

# Define a helper function to create Dense layers configured with the right
# activation and kernel initializer.
def dense_layer(num_units):
  return tf.keras.layers.Dense(
      num_units,
      activation=tf.keras.activations.relu,
      kernel_initializer=tf.keras.initializers.VarianceScaling(
          scale=2.0, mode='fan_in', distribution='truncated_normal'))

# QNetwork consists of a sequence of Dense layers followed by a dense layer
# with `num_actions` units to generate one q_value per available action as
# its output.
dense_layers = [dense_layer(num_units) for num_units in fc_layer_params]
q_values_layer = tf.keras.layers.Dense(
    num_actions,
    activation=None,
    kernel_initializer=tf.keras.initializers.RandomUniform(
        minval=-0.03, maxval=0.03),
    bias_initializer=tf.keras.initializers.Constant(-0.2))
q_net = sequential.Sequential(dense_layers + [q_values_layer])
The error is thrown at agent = dqn_agent.DqnAgent(...). There is a line in dqn_agent.py, q_network.create_variables(net_observation_spec), which seems to produce the (6, 4, 464) shape. I would have imagined the network's output shape would automatically be adopted from q_values_layer's num_actions. More than likely this is a failure on my end, but I have seen unresolved posts about it on StackOverflow. Can anyone please help correct my understanding / code here?
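(For context: Keras Dense layers only operate on the last axis of their input, so a rank-3 observation of shape (6, 4, 4) passes through every Dense layer with its leading (6, 4) dims intact, and the final layer emits (6, 4, num_actions) instead of the rank-1 (num_actions,) that DqnAgent expects. A minimal numpy sketch of that shape arithmetic, using the 464 from the error message as num_actions:)

```python
import numpy as np

num_actions = 464
obs = np.zeros((6, 4, 4))

# Dense is effectively x @ W + b over the last axis, so the
# leading (6, 4) dims survive every layer of the network.
w = np.zeros((4, num_actions))
print((obs @ w).shape)   # (6, 4, 464)  <- the rejected spec

# Flattening the observation to rank 1 first collapses it to
# 6 * 4 * 4 = 96 features, so the Q-values come out rank 1.
flat = obs.reshape(-1)               # shape (96,)
w_flat = np.zeros((96, num_actions))
print((flat @ w_flat).shape)         # (464,)  <- what DqnAgent expects
```

If that is the cause, one likely fix (not verified against this exact environment) is to prepend a flatten layer to the network, e.g. q_net = sequential.Sequential([tf.keras.layers.Flatten()] + dense_layers + [q_values_layer]).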
I don't know if this solution would work for everyone else, but you can play around with the observation spec in your environment. My shape was (1, 2) and it produced the same error as in this issue; changing it to (2,) worked!
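(A minimal numpy sketch of why dropping the singleton dim helps, assuming the same Dense-over-last-axis behavior described above; the num_actions value here is made up for illustration:)

```python
import numpy as np

num_actions = 3

# A (1, 2) observation is rank 2, so a Dense Q-layer keeps the
# leading 1 and emits a (1, num_actions) spec, which DqnAgent rejects.
obs_1x2 = np.zeros((1, 2))
w = np.zeros((2, num_actions))
print((obs_1x2 @ w).shape)   # (1, 3) -- rejected

# With the singleton dim dropped from the spec, the observation is
# rank 1 and the Q-values come out as a single rank-1 vector.
obs_2 = np.squeeze(obs_1x2, axis=0)  # shape (2,)
print((obs_2 @ w).shape)             # (3,) -- accepted
```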