Sample One Step Look Ahead Controller (2 Player)
The Sample One Step Look-Ahead controller is a simple controller that evaluates the states reachable within one move from the current state. It tries every action available in the current state (via a call to advance) and evaluates the state found after applying each of these actions. The action that leads to the state with the best reward is the one that is executed.
In order to advance the forward model, an array of actions is needed, containing one action for each player (the index in the array corresponding to that player's ID). In this case, the action chosen for the opponent is the one returned by the getOppNotLosingAction method, which picks at random an action the opponent could take that would not make it lose the game, assuming the current player does nothing.
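As an illustration of this joint-action pattern, here is a minimal sketch (not part of the sample; the helper name evaluateJointAction is hypothetical, and the fields and heuristic class are those of the listing below):

    //Hypothetical helper illustrating the joint-action pattern: the forward model
    //only advances when given exactly one action per player, indexed by player ID.
    private double evaluateJointAction(StateObservationMulti stateObs,
                                       Types.ACTIONS myAction, Types.ACTIONS oppAction,
                                       int id, int oppID, SimpleStateHeuristic heuristic) {
        StateObservationMulti stCopy = stateObs.copy();        //never advance the real state
        Types.ACTIONS[] acts = new Types.ACTIONS[stateObs.getNoPlayers()];
        acts[id] = myAction;                                   //this agent's candidate action
        acts[oppID] = oppAction;                               //predicted opponent action
        stCopy.advance(acts);                                  //apply both actions at once
        return heuristic.evaluateState(stCopy, id);            //value of the resulting state
    }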
Note that the controller stores the player ID and the number of players in the game in the constructor, then derives the opponent's ID from them.
From Agent.java:
import core.game.StateObservationMulti;
import core.player.AbstractMultiPlayer;
import ontology.Types;
import tools.ElapsedCpuTimer;
import tools.Utils;

import java.util.ArrayList;
import java.util.Random;

public class Agent extends AbstractMultiPlayer {

    int oppID;          //player ID of the opponent
    int id;             //ID of this player
    int no_players;     //number of players in the game
    public static double epsilon = 1e-6;
    public static Random m_rnd;

    /**
     * Initialize all variables for the agent.
     * @param stateObs Observation of the current state.
     * @param elapsedTimer Timer when the action returned is due.
     * @param playerID ID of this agent.
     */
    public Agent(StateObservationMulti stateObs, ElapsedCpuTimer elapsedTimer, int playerID) {
        m_rnd = new Random();

        //get game information
        no_players = stateObs.getNoPlayers();
        id = playerID;                                      //player ID of this agent
        oppID = (playerID + 1) % stateObs.getNoPlayers();   //player ID of the opponent
    }

    /**
     * Very simple one step lookahead agent.
     * Pass the player ID to all state observation methods to query the right player.
     * Omitting the player ID will result in it being set to the default 0 (first player, whichever that is).
     *
     * @param stateObs Observation of the current state.
     * @param elapsedTimer Timer when the action returned is due.
     * @return An action for the current state.
     */
    public Types.ACTIONS act(StateObservationMulti stateObs, ElapsedCpuTimer elapsedTimer) {

        Types.ACTIONS bestAction = null;
        double maxQ = Double.NEGATIVE_INFINITY;

        //A random non-suicidal action by the opponent.
        Types.ACTIONS oppAction = getOppNotLosingAction(stateObs, id, oppID);

        SimpleStateHeuristic heuristic = new SimpleStateHeuristic(stateObs);
        for (Types.ACTIONS action : stateObs.getAvailableActions(id)) {

            StateObservationMulti stCopy = stateObs.copy();

            //need to provide actions for all players to advance the forward model
            Types.ACTIONS[] acts = new Types.ACTIONS[no_players];

            //set this agent's action, and the predicted opponent action
            acts[id] = action;
            acts[oppID] = oppAction;

            stCopy.advance(acts);

            double Q = heuristic.evaluateState(stCopy, id);
            Q = Utils.noise(Q, epsilon, m_rnd.nextDouble());

            //System.out.println("Action:" + action + " score:" + Q);
            if (Q > maxQ) {
                maxQ = Q;
                bestAction = action;
            }
        }

        //System.out.println("======== " + getPlayerID() + " " + maxQ + " " + bestAction + "============");
        //System.out.println(elapsedTimer.remainingTimeMillis());
        return bestAction;
    }

    //Returns an action, at random, that the opponent would make, assuming this player
    //does NIL, and which would not make the opponent lose the game.
    private Types.ACTIONS getOppNotLosingAction(StateObservationMulti stm, int thisID, int oppID) {

        int no_players = stm.getNoPlayers();
        ArrayList<Types.ACTIONS> oppActions = stm.getAvailableActions(oppID);
        ArrayList<Types.ACTIONS> nonDeathActions = new ArrayList<>();

        //Look for the opponent's actions that would not kill the opponent.
        for (Types.ACTIONS action : stm.getAvailableActions(oppID)) {
            Types.ACTIONS[] acts = new Types.ACTIONS[no_players];
            acts[thisID] = Types.ACTIONS.ACTION_NIL;
            acts[oppID] = action;

            StateObservationMulti stCopy = stm.copy();
            stCopy.advance(acts);

            if (stCopy.getMultiGameWinner()[oppID] != Types.WINNER.PLAYER_LOSES)
                nonDeathActions.add(action);
        }

        if (nonDeathActions.size() == 0)
            //All actions lose: simply pick one at random.
            return oppActions.get(m_rnd.nextInt(oppActions.size()));
        else
            //Random, but among those that would not kill the opponent.
            return (Types.ACTIONS) Utils.choice(nonDeathActions.toArray(), m_rnd);
    }
}
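The opponent model used here is deliberately simple: it assumes the opponent avoids immediately losing actions but is otherwise random. A possible (hypothetical) refinement, not part of the sample, would assume the opponent plays greedily against the same heuristic from its own perspective. A sketch, using only the methods shown above:

    //Hypothetical alternative opponent model: assume the opponent greedily
    //maximises the same heuristic, evaluated from its own point of view.
    private Types.ACTIONS getOppGreedyAction(StateObservationMulti stm, int thisID, int oppID) {
        SimpleStateHeuristic heuristic = new SimpleStateHeuristic(stm);
        Types.ACTIONS best = Types.ACTIONS.ACTION_NIL;
        double bestQ = Double.NEGATIVE_INFINITY;

        for (Types.ACTIONS action : stm.getAvailableActions(oppID)) {
            Types.ACTIONS[] acts = new Types.ACTIONS[stm.getNoPlayers()];
            acts[thisID] = Types.ACTIONS.ACTION_NIL;    //assume this player stands still
            acts[oppID] = action;

            StateObservationMulti stCopy = stm.copy();
            stCopy.advance(acts);

            //evaluate the resulting state from the opponent's perspective
            double q = heuristic.evaluateState(stCopy, oppID);
            if (q > bestQ) {
                bestQ = q;
                best = action;
            }
        }
        return best;
    }

A stronger opponent model makes the one-step search more pessimistic (closer to a minimax assumption), at the cost of extra forward-model calls per move.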
The state evaluation is performed by the multi-player version of the class SimpleStateHeuristic, when its method evaluateState is called. The following code (from SimpleStateHeuristic.java) shows how this method evaluates the given state. Note that it uses some of the methods described in the Forward Model section, querying for the positions of other sprites in the game.
import core.game.Observation;
import core.game.StateObservationMulti;
import ontology.Types;
import tools.Vector2d;

import java.util.ArrayList;
import java.util.HashMap;

public class SimpleStateHeuristic extends StateHeuristicMulti {

    double initialNpcCounter = 0;

    public SimpleStateHeuristic(StateObservationMulti stateObs) {
    }

    public double evaluateState(StateObservationMulti stateObs, int playerID) {
        Vector2d avatarPosition = stateObs.getAvatarPosition(playerID);
        ArrayList<Observation>[] npcPositions = stateObs.getNPCPositions(avatarPosition);
        ArrayList<Observation>[] portalPositions = stateObs.getPortalsPositions(avatarPosition);
        HashMap<Integer, Integer> resources = stateObs.getAvatarResources(playerID);

        ArrayList<Observation>[] npcPositionsNotSorted = stateObs.getNPCPositions();

        double won = 0;
        int oppID = (playerID + 1) % stateObs.getNoPlayers();
        Types.WINNER[] winners = stateObs.getMultiGameWinner();

        boolean bothWin = (winners[playerID] == Types.WINNER.PLAYER_WINS) && (winners[oppID] == Types.WINNER.PLAYER_WINS);
        boolean meWins = (winners[playerID] == Types.WINNER.PLAYER_WINS) && (winners[oppID] == Types.WINNER.PLAYER_LOSES);
        boolean meLoses = (winners[playerID] == Types.WINNER.PLAYER_LOSES) && (winners[oppID] == Types.WINNER.PLAYER_WINS);
        boolean bothLose = (winners[playerID] == Types.WINNER.PLAYER_LOSES) && (winners[oppID] == Types.WINNER.PLAYER_LOSES);

        if (meWins || bothWin)
            won = 1000000000;
        else if (meLoses)
            return -999999999;

        double minDistance = Double.POSITIVE_INFINITY;
        Vector2d minObject = null;
        int minNPC_ID = -1;
        int minNPCType = -1;
        int npcCounter = 0;

        if (npcPositions != null) {
            for (ArrayList<Observation> npcs : npcPositions) {
                if (npcs.size() > 0) {
                    minObject = npcs.get(0).position;   //This is the closest NPC
                    minDistance = npcs.get(0).sqDist;   //This is the (square) distance to the closest NPC
                    minNPC_ID = npcs.get(0).obsID;      //This is the id of the closest NPC
                    minNPCType = npcs.get(0).itype;     //This is the type of the closest NPC
                    npcCounter += npcs.size();
                }
            }
        }

        if (portalPositions == null) {
            double score = 0;
            if (npcCounter == 0) {
                score = stateObs.getGameScore(playerID) + won * 100000000;
            } else {
                score = -minDistance / 100.0 + (-npcCounter) * 100.0 + stateObs.getGameScore(playerID) + won * 100000000;
            }
            return score;
        }

        double minDistancePortal = Double.POSITIVE_INFINITY;
        Vector2d minObjectPortal = null;
        for (ArrayList<Observation> portals : portalPositions) {
            if (portals.size() > 0) {
                minObjectPortal = portals.get(0).position;   //This is the closest portal
                minDistancePortal = portals.get(0).sqDist;   //This is the (square) distance to the closest portal
            }
        }

        double score = 0;
        if (minObjectPortal == null) {
            score = stateObs.getGameScore(playerID) + won * 100000000;
        } else {
            score = stateObs.getGameScore(playerID) + won * 1000000 - minDistancePortal * 10.0;
        }

        return score;
    }
}
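Any class extending StateHeuristicMulti can be swapped into act() in place of SimpleStateHeuristic. As a minimal (hypothetical) example of that extension point, assuming only the methods shown above, a heuristic that considers just the game score and the win/lose flags might look like this:

    import core.game.StateObservationMulti;
    import ontology.Types;

    //Hypothetical minimal heuristic, shown only to illustrate the extension
    //point: score plus a large bonus/penalty for winning or losing.
    public class ScoreOnlyHeuristic extends StateHeuristicMulti {

        public double evaluateState(StateObservationMulti stateObs, int playerID) {
            Types.WINNER[] winners = stateObs.getMultiGameWinner();
            if (winners[playerID] == Types.WINNER.PLAYER_WINS)
                return 1000000000 + stateObs.getGameScore(playerID);
            if (winners[playerID] == Types.WINNER.PLAYER_LOSES)
                return -999999999;
            return stateObs.getGameScore(playerID);
        }
    }

Such a heuristic ignores NPC and portal positions entirely, so it only discriminates between states whose score or end-game status differ; SimpleStateHeuristic adds the distance terms precisely to break those ties.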