Create controllers
The Single-Player Learning track accepts submissions of agents written in Java or Python. The Python client has been tested with Python 3.5. On the server, TensorFlow has been installed for Python 3.5.
This page details the methods to be implemented in the Java agent or Python agent.
At each game tick, 3 types of serialised state observation can be requested by the agent. This section explains the legal serialised state observation types and how to set them.
The StateObservation is the observation of the current state of the game, which the agent can use to decide the next action to take (see the doc for the planning track for detailed information), but the events are not included.
The screenshot is a PNG of the actual game screen without frame border and title.
- Types.LEARNING_SSO_TYPE.JSON: the serialised StateObservation without the screenshot.
- Types.LEARNING_SSO_TYPE.IMAGE: the screenshot plus gameScore, gameTick, gameWinner, isGameOver and availableActions.
- Types.LEARNING_SSO_TYPE.BOTH: the serialised StateObservation and the screenshot.
For more details, please refer to SerializableStateObservation.
- Java agent: for instance, your agent can set the serialised state observation type by doing lastSsoType = Types.LEARNING_SSO_TYPE.JSON;
- Python agent: for instance, your agent can set the serialised state observation type by doing self.lastSsoType = LEARNING_SSO_TYPE.JSON
By default, Types.LEARNING_SSO_TYPE.JSON is set for the Java agent and LEARNING_SSO_TYPE.JSON for the Python agent.
You can re-set the serialised state observation type when a game is started, initialised, being played or finished (terminated normally or aborted):
- When a game is started: set the serialised state observation type in the constructor of the Agent class, Agent(...).
- When a game is initialised: set it in init(...).
- When a game is being played: set it in act(...).
- When a game is finished: set it in result(...).
A sample random agent: Agent.java.
The Agent class should inherit from utils/AbstractPlayer.java.
public Agent(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer){...}
The constructor receives two parameters:
- SerializableStateObservation sso: the SerializableStateObservation is the serialised StateObservation without the forward model, which is a String or a screenshot (.png).
- ElapsedCpuTimer elapsedTimer: the ElapsedCpuTimer is a class that allows querying for the remaining CPU time the agent has to return an action. You can query the number of milliseconds passed since the method was called (elapsedMillis()) or the remaining time until the timer runs out (remainingTimeMillis()). The constructor has 1 second. If remainingTimeMillis() ≤ 0, the agent is disqualified in the game being played.
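As a minimal sketch (in the spirit of the sample random agent), the constructor can simply initialise internal state and choose the serialised state observation type. The random field is an assumption made for this example, and the framework imports (SerializableStateObservation, ElapsedCpuTimer, Types, AbstractPlayer) are omitted because their packages depend on the client code layout:

```java
import java.util.Random;

// Sketch of an Agent skeleton; framework imports omitted (they depend on the client layout).
public class Agent extends AbstractPlayer {

    // Assumed helper field for this sketch, reused by act() below; not part of the required interface.
    private Random random;

    public Agent(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer) {
        random = new Random();
        // Request the JSON observation (no screenshot) for the following calls.
        lastSsoType = Types.LEARNING_SSO_TYPE.JSON;
    }
}
```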
public Types.ACTIONS init(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer){...}
The init method is called once after the constructor, before selecting any action to play. It receives two parameters:
- SerializableStateObservation sso.
- ElapsedCpuTimer elapsedTimer: (see previous section). The act has to finish in 40 ms, otherwise the NIL_ACTION will be played.
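A minimal init sketch that performs no per-game set-up; returning Types.ACTIONS.ACTION_NIL as a harmless default is an assumption of this example, not a requirement of the interface:

```java
public Types.ACTIONS init(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer) {
    // Per-game set-up (e.g. resetting counters or loading a model) would go here,
    // keeping an eye on elapsedTimer.remainingTimeMillis().
    return Types.ACTIONS.ACTION_NIL;  // assumed safe default for this sketch
}
```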
public Types.ACTIONS act(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer){...}
The act method selects an action to play at every game tick. It receives two parameters:
- SerializableStateObservation sso.
- ElapsedCpuTimer elapsedTimer: the timer with maximal time 40 ms. The act has to finish in 40 ms, otherwise this agent is disqualified in the game being played.
The agent can abort the current game by returning the action ACTION_ESCAPE. The agent then receives the results and the serialised state observation sso of the unfinished game, together with the timer, and returns the next level to play via the method int result(...).
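Continuing the sketch above, an act method could play a uniformly random available action and abort after a fixed number of ticks. The getAvailableActions() accessor, the gameTick field and the 1000-tick threshold are assumptions made for this example; check the sample Agent.java for the exact members exposed by the client:

```java
public Types.ACTIONS act(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer) {
    // Abort the current game after an arbitrary number of ticks (assumed threshold).
    if (sso.gameTick >= 1000) {
        return Types.ACTIONS.ACTION_ESCAPE;
    }
    // Otherwise pick a uniformly random action among those currently available.
    java.util.ArrayList<Types.ACTIONS> actions = sso.getAvailableActions();
    return actions.get(random.nextInt(actions.size()));
}
```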
public int result(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer) {...}
During step 2 of training, after terminating a game and receiving the results and final game state, the agent is supposed to select the next level to play. If the returned level id $\not\in \{0,1,2\}$, then a random level id $\in \{0,1,2\}$ will be passed and a new game will start. The result method receives two parameters:
- SerializableStateObservation sso: the serialised observation of the final game state at termination.
- ElapsedCpuTimer elapsedTimer: the global timer with maximal time 5 minutes for the whole training. If there is no time left (remainingTimeMillis() ≤ 0), an extra timer with maximal time of 1 second will be passed.
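A sketch of a result method that simply cycles through the three training levels; the nextLevel field is an assumption made for this example:

```java
private int nextLevel = 0;  // assumed field tracking which level to request next

public int result(SerializableStateObservation sso, ElapsedCpuTimer elapsedTimer) {
    // Return a level id in {0, 1, 2}; any other value makes the server pick a random level.
    int level = nextLevel;
    nextLevel = (nextLevel + 1) % 3;
    return level;
}
```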
A sample random agent: Agent.py.
The Agent class should inherit from utils/AbstractPlayer.py.
The interface of the Python agent remains the same as the Java agent.
You may also be interested in:
- GVG Framework
  - Tracks Description
  - Code Structure
  - Creating Controllers
  - Creating Multi Player Controllers
  - Creating Level Generators
  - Running & Testing Level Generators
  - Creating Rule Generators
  - Running & Testing Rule Generators
- Forward Model and State Observation
  - Advancing and copying the state
  - Advancing and copying the state (2 Player)
  - Querying the state of the game
  - Querying the state of the game (2 Player)
  - Information about the state of the Avatar
  - Information about the state of the Avatar (2 Player)
  - Information about events happened in the game
  - Information about other sprites in the game
- Game Description Class
  - Constraints
  - Game Analyzer Class
  - Level Analyzer Class
  - Sprite Level Description Class
  - Sprite, Termination, and Interaction Data Class
  - Level Mapping Class
- Competition Specifications
- VGDL Language