This is the implementation for the paper "PokéChamp: an Expert-level Minimax Language Agent for Competitive Pokémon"
```bash
conda create -n pokechamp python=3.12
conda activate pokechamp
pip install -r requirements.txt
```
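A quick way to confirm the environment is set up is an import check (this assumes poke-env is among the pinned requirements, as the acknowledgement at the end of this README suggests):

```python
# Minimal sanity check: fails loudly if the environment is incomplete.
import poke_env

print("poke-env imported from:", poke_env.__file__)
```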
- Install Node.js v10+.
- Clone the Pokémon Showdown repository and set it up:
```bash
git clone https://github.com/smogon/pokemon-showdown.git
cd pokemon-showdown
# Optional: all repo features were tested against the following Showdown commit.
# git reset --hard dd4b004e54d4ef8c66c8b583a8fa64b020574727
npm install
cp config/config-example.js config/config.js
node pokemon-showdown start --no-security
```
Then open http://localhost:8000/ in your browser to verify the server is running.
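To check that the server is also reachable from Python, a minimal poke-env sketch like the following runs one self-play battle against it (`RandomPlayer` is only a stand-in for the repo's agents):

```python
import asyncio

from poke_env import LocalhostServerConfiguration
from poke_env.player import RandomPlayer


async def main():
    # Two throwaway players connected to the local, no-security server.
    p1 = RandomPlayer(
        battle_format="gen9randombattle",
        server_configuration=LocalhostServerConfiguration,
    )
    p2 = RandomPlayer(
        battle_format="gen9randombattle",
        server_configuration=LocalhostServerConfiguration,
    )
    await p1.battle_against(p2, n_battles=1)
    print(f"p1 won {p1.n_won_battles} of {p1.n_finished_battles} battles")


asyncio.run(main())
```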
To evaluate a group of agents in the Gen 9 OU format:

```bash
python evaluate_gen9ou.py
```
This script runs battles between PokéChamp and the baseline bots, including PokéLLMon and Abyssal Bot, and reports win rates, Elo ratings, and the average number of turns per battle.
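The underlying evaluation loop resembles poke-env's `cross_evaluate` round-robin; here is a hedged sketch with `RandomPlayer` stand-ins for the actual agents (the real script instantiates PokéChamp and the baselines, and Gen 9 OU additionally requires teams, so random battles are used below):

```python
import asyncio

from poke_env.player import RandomPlayer, cross_evaluate


async def main():
    # Stand-ins for PokéChamp and two baselines; random battles avoid
    # the team-building that Gen 9 OU would require in this sketch.
    players = [
        RandomPlayer(battle_format="gen9randombattle", max_concurrent_battles=10)
        for _ in range(3)
    ]
    # Round-robin: every player battles every other n_challenges times.
    results = await cross_evaluate(players, n_challenges=5)
    for name, row in results.items():
        for opponent, win_rate in row.items():
            if win_rate is not None:
                print(f"{name} vs {opponent}: {win_rate:.2f}")


asyncio.run(main())
```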
To run a single battle between two agents on the local server:

```bash
python local_1v1.py
```
To battle the agent yourself, first log into your other account manually on the local server and choose "[Gen 9] Random Battle", then run:

```bash
python human_agent_1v1.py
```
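Conceptually, the agent just connects to the local server and waits for a challenge from your human account; a minimal sketch of that pairing, with placeholder account names and `RandomPlayer` standing in for PokéChamp:

```python
import asyncio

from poke_env import AccountConfiguration, LocalhostServerConfiguration
from poke_env.player import RandomPlayer


async def main():
    # Placeholder bot account; no password is needed on the
    # no-security local server.
    agent = RandomPlayer(
        account_configuration=AccountConfiguration("pokechamp_bot", None),
        battle_format="gen9randombattle",
        server_configuration=LocalhostServerConfiguration,
    )
    # Block until one challenge arrives from your human account.
    await agent.accept_challenges("your_human_account", n_challenges=1)


asyncio.run(main())
```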
To play on the official Pokémon Showdown ladder, register an account at https://play.pokemonshowdown.com/, set a password, and log in at the same page. Then run the following, filling in PokéChamp's username and password (no local server is needed):

```bash
python showdown_ladder.py --USERNAME $USERNAME --PASSWORD $PASSWORD
```
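For reference, laddering with poke-env looks roughly like the sketch below (`RandomPlayer` again stands in for PokéChamp; the credentials are placeholders for the account registered above):

```python
import asyncio

from poke_env import AccountConfiguration, ShowdownServerConfiguration
from poke_env.player import RandomPlayer


async def main():
    player = RandomPlayer(
        account_configuration=AccountConfiguration("your_username", "your_password"),
        battle_format="gen9randombattle",
        # Points at the official play.pokemonshowdown.com server.
        server_configuration=ShowdownServerConfiguration,
    )
    await player.ladder(5)  # play five ranked ladder games


asyncio.run(main())
```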
The action prediction experiments require downloading our dataset (release TBD). To reproduce the results:
```bash
python evaluate_action_prediction.py
```
This script analyzes the dataset and reports prediction accuracy for player and opponent actions across different Elo ratings.
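Since the dataset is not yet released, the following only sketches the kind of aggregation involved, assuming a flat table with predicted and ground-truth actions plus the player's Elo (the file name and all column names are hypothetical):

```python
import pandas as pd

# Hypothetical schema: one row per turn, with the model's predicted
# action, the action actually taken, and the acting player's Elo.
df = pd.read_csv("action_predictions.csv")

df["correct"] = df["predicted_action"] == df["actual_action"]
# Bucket turns by Elo and report mean prediction accuracy per bucket.
bins = [0, 1200, 1400, 1600, 1800, 3000]
df["elo_bucket"] = pd.cut(df["elo"], bins=bins)
print(df.groupby("elo_bucket", observed=True)["correct"].mean())
```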
The environment is built on top of PokéLLMon and poke-env. This work provides an implementation of the PokéChamp paper:
```bibtex
@article{karten2025pokechamp,
  title={PokéChamp: an Expert-level Minimax Language Agent for Competitive Pokémon},
  author={Karten, Seth and Nguyen, Andy Luu and Jin, Chi},
  journal={arXiv preprint arXiv:2503.04094},
  year={2025}
}
```