Desiderata

About this project

The United Unity Universe (U3) is a framework for building and iterating on reinforcement learning environments in Unity 3D, using both Unity and Python. Its design follows the conventions presented in the DeepMind paper "Using Unity to Help Solve Intelligence." U3 targets machine learning researchers: it offers interfaces for efficiently integrating diverse code when implementing new environments and designing experiments that test the boundaries of an agent's capabilities. U3 also includes pre-defined environments such as GridWorld and OpenXLand, and encourages researchers to contribute their own custom environments or modifications to the existing codebase.
The first goal for the project is to recapitulate XLand and Ada. This step will serve as a benchmark to confirm that everything is running smoothly. Having a working version of Ada will also allow us to test its capabilities. Specifically, I am interested in using the Ada model as a foundational model in novel tasks and testing its zero-shot or few-shot capabilities. For example, can Ada adapt to a reskinned environment (changing only textures/models)? Can it adapt to a new set of rules/objects? Can it be used as a foundational model on new rules/objects? Can it then adapt to more complex tasks that require a combination of rules (such as using an object to build a bridge across a gap)? Finally, can Ada be used as a foundational model for a completely different environment as long as the basic controller is the same (new tasks, new visuals, new rewards)?

More details: https://docs.google.com/document/d/1iQD3saJDTD4fPgFONWtWWoQOIJ4glgURXhtg5951-VQ/edit

Desiderata

Dynamic environment setup and probing: ML researchers need the ability to customize and probe environments to evaluate task difficulty and select environments appropriate for the current policy. U3 enables complete environmental customization from the Python side before an episode starts, allowing researchers to find or create an appropriate environment instance for training.
Dynamic environment loading: Saving and loading environments during testing phases or environmental setup can be useful for debugging and improving models. The U3 framework facilitates easy serialization of environments, with a focus on interpretability rather than speed. This framework also facilitates experimentation for testing how the trained agent reacts to certain situations in the environment.
Multiple independent environments in a single Unity instance: Running many environment instances in the same Unity process reduces the cost of the Unity engine overhead. Our framework also allows a single Unity instance to run distinct environments (such as a GridWorld and a 3D environment) simultaneously.
Multiple agents in a single environment: Multi-agent environments are useful both for multi-agent research and for modifying task difficulty. U3 supports multiple agents in a single environment using the PettingZoo interface. The decision interval for each agent is not fixed, allowing for greater flexibility in training models.
Python interface using Docker, Gym/PettingZoo, and a simple API for environmental manipulations: U3 is designed to be a community tool, and as such, it uses existing standards for all interfaces into the framework. The environmental API is environment-specific, but basic functionality such as serialization is done through a standardized JSON format. Docker is used for the Unity instances to increase reproducibility and simplify setup.
Modular code design: To make U3 accessible to as many researchers as possible, the framework encourages modular code using Unity components. U3 comes with several basic environments already defined, but users can create new objects and add new features to those objects using components. This means that community code can be mixed and matched to fit the unique requirements of each project with minimal coding.
Tools for environmental initialization: The ability to randomize environment layouts is important for creating diverse and challenging environments for training models. U3 provides a number of tools for environmental initialization, such as wave function collapse and compositional pattern-producing networks.
Tools for human experiments: To compare models against a human-level baseline, or to gather expert trajectories from human players, U3 provides a simple human interface both as a standalone application and as a web plugin.
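
To illustrate the serialization desideratum above, here is a minimal sketch of an interpretable JSON round trip for an environment description. It is not U3's actual format: the `EnvSpec` and `ObjectSpec` classes, their fields, and the `save_env`/`load_env` helpers are all hypothetical names chosen for illustration. The sketch favors readability (indented output, sorted keys) over speed, in line with the stated focus on interpretability.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ObjectSpec:
    """Hypothetical description of one object placed in an environment."""
    name: str
    position: tuple  # (x, y, z) coordinates
    components: list = field(default_factory=list)  # names of attached components


@dataclass
class EnvSpec:
    """Hypothetical description of a whole environment instance."""
    env_type: str
    seed: int
    objects: list = field(default_factory=list)


def save_env(spec):
    """Serialize an EnvSpec to human-readable JSON text."""
    return json.dumps(asdict(spec), indent=2, sort_keys=True)


def load_env(text):
    """Rebuild an EnvSpec from JSON text produced by save_env."""
    raw = json.loads(text)
    objects = [
        # JSON has no tuple type, so positions come back as lists.
        ObjectSpec(o["name"], tuple(o["position"]), list(o["components"]))
        for o in raw["objects"]
    ]
    return EnvSpec(raw["env_type"], raw["seed"], objects)


if __name__ == "__main__":
    spec = EnvSpec("GridWorld", 7, [ObjectSpec("goal", (1, 0, 2), ["Reward"])])
    text = save_env(spec)
    print(text)
    assert load_env(text) == spec  # the round trip preserves the description
```

Because the saved text is plain JSON, a researcher could edit a saved environment by hand (for example, moving an object) and reload it to probe how a trained agent reacts.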

Implementation Details

The project is split into a Unity side and a Python side. The Unity side contains the code that sets up RL environments within the Unity game engine. The Python side contains wrappers for interfacing with the environment during training: a PettingZoo wrapper and extra API calls that enable U3-specific functionality (such as probing environment complexity).
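
As a rough sketch of what training code looks like against a PettingZoo-style wrapper, the example below runs the agent-iteration loop (`agent_iter`, `last`, `step`) used by PettingZoo's AEC API. The `ToyTurnEnv` class is a hypothetical stand-in written for this example so that it runs without Unity or PettingZoo installed; it is not the U3 wrapper, and its rewards and observations are placeholders. It picks the next acting agent at random to mimic agents whose decision intervals are not fixed.

```python
import random


class ToyTurnEnv:
    """Hypothetical stand-in that mimics the shape of PettingZoo's AEC loop."""

    def __init__(self, agents, max_steps=6):
        self.possible_agents = list(agents)
        self.max_steps = max_steps

    def reset(self, seed=None):
        self.rng = random.Random(seed)
        self.agents = list(self.possible_agents)
        self.steps = 0
        self.agent_selection = self.agents[0]
        self.rewards = {agent: 0.0 for agent in self.agents}

    def agent_iter(self):
        # Yield whichever agent must act next until the episode ends.
        while self.steps < self.max_steps:
            yield self.agent_selection

    def last(self):
        # Observation, reward, termination, truncation, info for the current agent.
        agent = self.agent_selection
        return {"step": self.steps}, self.rewards[agent], False, False, {}

    def step(self, action):
        agent = self.agent_selection
        self.rewards[agent] = 0.0 if action is None else float(action)
        self.steps += 1
        # Assumption for illustration: the next agent is chosen at random,
        # so agents do not act in a fixed round-robin order.
        self.agent_selection = self.rng.choice(self.agents)


def run_episode(env, policy, seed=0):
    """Run one episode and return the (agent, action) pairs in order."""
    env.reset(seed=seed)
    log = []
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        action = None if termination or truncation else policy(agent, observation)
        env.step(action)
        log.append((agent, action))
    return log


if __name__ == "__main__":
    env = ToyTurnEnv(["agent_0", "agent_1"], max_steps=5)
    for agent, action in run_episode(env, lambda agent, obs: 1):
        print(agent, action)
```

Any U3-specific calls, such as probing environment complexity, would sit alongside this loop; their names and signatures are not shown here because the text above does not specify them.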

