ML Research Benchmark Baseline Agent

This is our public baseline research and development agent, an agentic system intended as a baseline for AI and machine learning tasks. It provides a foundation for comparing and evaluating the machine learning research and development work that agents can perform. It is a simple, single-agent system that uses a task planner and a set of tools to carry out machine learning tasks.

Features

Supports multiple AI/ML tasks
Compatible with different LLM providers (OpenAI, Anthropic)
Dockerized for easy deployment and reproducibility

Available Tools

The ML Research Benchmark Baseline Agent comes with the following tools for AI and machine learning tasks:

Bash Tool: Executes bash commands and scripts.
Code Tool: Manages code operations including writing, inserting, replacing, and deleting code.
GitHub Tool: Interacts with GitHub repositories to get README files, list files, and retrieve file contents.
Semantic Scholar Tool: Searches for academic papers, retrieves paper details, citations, and downloads papers.
Python Tool: Executes Python code.
Return Function Tool: Handles task completion.
Scratchpad Tool: Provides a scratchpad for experiment note-taking and temporary storage.
Thought Tool: Allows the agent to process and record thoughts.
Long-Term Memory Tool: Manages long-term memory storage and retrieval.

These tools can be used individually or in combination, and the agent can switch between them as needed within a single task.
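
To illustrate how a single agent can combine tools like these, here is a minimal sketch of a name-to-function tool registry. It is not this repository's implementation: `TOOLS`, `run_tool`, `bash_tool`, and `scratchpad_tool` are hypothetical names chosen for illustration, and the real tools may take different arguments.

```python
import subprocess
from typing import Callable, Dict, List

# Hypothetical in-memory scratchpad; the real Scratchpad Tool may work differently.
_notes: List[str] = []


def bash_tool(command: str) -> str:
    """Run a shell command and return its stdout followed by its stderr."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr


def scratchpad_tool(note: str) -> str:
    """Store a note and report how many notes are held so far."""
    _notes.append(note)
    return f"stored note {len(_notes)}"


# Hypothetical registry mapping a tool name to the function that implements it.
TOOLS: Dict[str, Callable[[str], str]] = {
    "bash": bash_tool,
    "scratchpad": scratchpad_tool,
}


def run_tool(name: str, argument: str) -> str:
    """Dispatch one tool call by name, failing loudly on unknown tools."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](argument)
```

With a registry like this, a planner only has to emit a tool name and an argument at each step, for example `run_tool("scratchpad", "baseline accuracy recorded")` followed by `run_tool("bash", "ls")`.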

Prerequisites

Python 3.x
Docker (for containerized execution)

Installation

Clone this repository:

git clone https://github.com/AlgorithmicResearchGroup/ML-Research-Agent-Public.git
cd ML-Research-Agent-Public

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Step 1: Create a .env file with the following environment variables:

OPENAI = <your openai api key>
ANTHROPIC = <your anthropic api key>
YOU_API_KEY = <your you.com api key> 
GITHUB_ACCESS_TOKEN = <your github access token>
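
Before running the agent, it can help to confirm that the .env file defines each of the variables listed above. The following is a minimal sketch that parses the `KEY = value` format shown here without any third-party library. `parse_env` and `missing_keys` are hypothetical helpers, not part of this repository, and treating all four variables as required is an assumption: a run that uses only one provider may not need every key.

```python
from typing import Dict, List

# Variable names taken from the .env listing above.
# Assumption: all four are treated as required.
REQUIRED_KEYS = ["OPENAI", "ANTHROPIC", "YOU_API_KEY", "GITHUB_ACCESS_TOKEN"]


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY = value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str], required: List[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty, in listing order."""
    return [key for key in required if not values.get(key)]
```

To use it, read the file contents (for example with `open(".env").read()`), pass them to `parse_env`, and check that `missing_keys` returns an empty list.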

Running without Docker

Step 2a: To run the agent without Docker, use the following command:

python3 run.py --prompt "<your prompt>" --provider "<openai or anthropic>"
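
If you want to launch the agent from another Python script, the same command can be assembled programmatically. This is a sketch only: `build_command` and `run_agent` are hypothetical helpers, and the only flags used are the `--prompt` and `--provider` flags shown above. Restricting the provider to the two documented values is an assumption based on this README.

```python
import subprocess
from typing import List

# Providers named in this README.
PROVIDERS = ("openai", "anthropic")


def build_command(prompt: str, provider: str) -> List[str]:
    """Build the argument list for run.py, rejecting unknown providers early."""
    if provider not in PROVIDERS:
        raise ValueError(f"provider must be one of {PROVIDERS}, got {provider!r}")
    return ["python3", "run.py", "--prompt", prompt, "--provider", provider]


def run_agent(prompt: str, provider: str) -> int:
    """Run the agent from the repository root and return its exit code."""
    return subprocess.run(build_command(prompt, provider)).returncode
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or quotes.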

Running with Docker

Step 2b: Run the agent with Docker:

Build for CPU:

docker build --build-arg BASE_IMAGE=ubuntu:22.04 -t <image_name> .

Build for GPU:

docker build --build-arg BASE_IMAGE=nvidia/cuda:12.2.2-cudnn8-runtime-ubuntu22.04 -t <image_name> .

Run the agent:

bash run.sh <image_name> \
   <prompt> \
   <provider> \
   <"cpu" or gpu_ids, e.g. 0> \
   <huggingface_token> \
   <env_file_path>

Example on CPU:

bash run.sh ghcr.io/algorithmicresearchgroup/ml-research-agent-public \
   "train an mlp on the mnist dataset" \
   openai \
   "cpu" \
   <huggingface_token> \
   /root/ML-Research-Agent-Public/.env

Example on GPU:

bash run.sh ghcr.io/algorithmicresearchgroup/ml-research-agent-public \
   "train an mlp on the mnist dataset" \
   openai \
   0 \
   <your huggingface token> \
   /path/to/.env
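
Because `run.sh` takes six positional arguments, their order is easy to get wrong. The sketch below assembles them in the order documented above and checks the device argument before anything is launched. `build_run_sh_args` is a hypothetical helper, not part of this repository, and the accepted GPU id format (digits, optionally comma-separated) is an assumption; the README only shows a single id such as `0`.

```python
import re
from typing import List


def build_run_sh_args(
    image: str,
    prompt: str,
    provider: str,
    device: str,
    hf_token: str,
    env_file: str,
) -> List[str]:
    """Return the run.sh invocation as an argument list, in documented order."""
    device = str(device)
    # Assumption: GPU ids are digits, optionally comma-separated ("0" or "0,1").
    if device != "cpu" and not re.fullmatch(r"\d+(,\d+)*", device):
        raise ValueError(f"device must be 'cpu' or GPU ids, got {device!r}")
    return ["bash", "run.sh", image, prompt, provider, device, hf_token, env_file]
```

The returned list can be passed directly to `subprocess.run` from the repository root.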

Contributing

Contributions to improve the baseline agent or add new tasks are welcome. Please submit a pull request or open an issue to discuss proposed changes.

License

AGPL-3.0

Contact

For questions or support, please contact Algorithmic Research Group at [email protected]
