
Compatibility with the Newest Overcooked-AI Env #4

Open
muzhancun opened this issue Jan 14, 2024 · 0 comments

I am trying to pair your reproduced baselines (downloaded from Google Drive) with my own human proxy model trained in the new Overcooked-AI environment (with old dynamics).
To make them compatible, I first use the old lossless_state_encoding function for the baselines (only for the baseline models, since my human proxy is trained in the new env).
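
Roughly, that first step looks like this; old_lossless_state_encoding is just a stand-in name for the encoding function I copied over from the old codebase, and how it gets attached to the baseline agent is elided:

def old_encoding_featurize(state):
    # Stand-in: apply the old per-player lossless encoding so the loaded
    # baselines see the observation planes they were trained on, while my
    # human proxy keeps using the new env's featurize_state_mdp.
    return old_lossless_state_encoding(base_env.mdp, state)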
Also, the new env requires TensorFlow 2 while your models were trained with TensorFlow 1, so I load your models with the following code:

import tensorflow as tf

def get_model_policy_from_saved_model(save_dir, sim_threads=30):
    """Get a policy function from a TF1 SavedModel restored under TF2."""
    predictor = tf.saved_model.load(save_dir)
    # Query the default serving signature and take the action distribution output.
    step_fn = lambda obs: predictor.signatures["serving_default"](tf.convert_to_tensor(obs, dtype=tf.float32))["action_probs"]
    return get_model_policy(step_fn, sim_threads)

However, this raises warnings like:

WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'agent0/ppo2_model/pi/conv_0/kernel:0' shape=(3, 3, 25, 25) dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
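
As a quick check that the restore is still usable despite the warning, I run the serving signature on a dummy batch (the path and the observation shape below are placeholders; the shape has to match the old lossless encoding for the layout, with 25 channels as in the kernel shape above):

import numpy as np
import tensorflow as tf

predictor = tf.saved_model.load("path/to/mep_baseline")  # placeholder path to a downloaded baseline
# Placeholder observation: batch of one, (H, W, C) must match the old encoding.
dummy_obs = np.zeros((1, 5, 4, 25), dtype=np.float32)
probs = predictor.signatures["serving_default"](tf.convert_to_tensor(dummy_obs))["action_probs"]
print(probs.shape, float(tf.reduce_sum(probs[0])))  # should be a valid distribution summing to ~1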

The code to pair the MEP baseline with my human proxy model is as follows:

def evaluate_hp_mep(hp_model_path, mep_model_path, layout, order=0):
    # Load the human proxy (BC) model and wrap it as an agent that uses the
    # new env's featurization.
    hp_model, hp_params = load_bc_model(hp_model_path)
    hp_policy = BehaviorCloningPolicy.from_model(
        hp_model, hp_params, stochastic=True
    )
    base_ae = _get_base_ae(hp_params)
    base_env = base_ae.env
    hp_agent = RlLibAgent(hp_policy, order, base_env.featurize_state_mdp)

    # Load the MEP baseline from the saved TF1 model.
    mep_agent = get_agent_from_saved_model(mep_model_path, sim_threads=30)

    # Evaluate on the given layout with old dynamics, horizon 400.
    ae = AgentEvaluator.from_layout_name(
        mdp_params={"layout_name": layout, "old_dynamics": True},
        env_params={"horizon": 400},
    )

    # `order` controls which seat the human proxy takes.
    if order == 0:
        ap = AgentPair(hp_agent, mep_agent)
    else:
        ap = AgentPair(mep_agent, hp_agent)
    result = ae.evaluate_agent_pair(ap, 5, 400)
    return result, result["ep_returns"]
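
For completeness, this is how I call it (the model paths and layout are placeholders for my local setup; order=0 seats the human proxy first, order=1 seats the MEP baseline first):

for order in (0, 1):
    result, returns = evaluate_hp_mep(
        "path/to/hp_model", "path/to/mep_model", "cramped_room", order=order
    )
    # Average episode return over the 5 evaluation games.
    print(f"order={order}: mean return {sum(returns) / len(returns):.1f}")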

But the evaluation results are quite unsatisfying. Am I missing some steps, or do the warnings matter?
