diff --git a/README.md b/README.md
index a5e693d..83d5490 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,11 @@
 
 OrchestrAI is a Python-based system that orchestrates interactions between multiple instances of OpenAI's GPT-4 model to execute complex tasks. It uses the `networkx` library to manage dependencies between various AI modules, and YAML to define and manage task pipelines.
 
+A few things to bear in mind:
+ 1. Autonomous agents are still toys, and you're likely to run into issues and breakages as your pipeline or task becomes more complex.
+ 2. This project aims to demonstrate a scalable framework for experimenting with autonomous agents. Instead of a fixed execution order, OrchestrAI offers the flexibility to define and compare variations of strategies and settings, to find the best approach for your use case.
+ 3. Be nice to your agents, or you might regret it later.
+
 ## Getting Started
 
 ### Prerequisites
@@ -23,82 +28,138 @@ pip install -r requirements.txt
 
 OrchestrAI requires the following files:
 
-- `ai.py` - Manages interactions with the OpenAI GPT-4 model. Set your OpenAI API key here.
-- `modules.py` - Contains available AI modules.
-- `orchestrate.py` - Loads modules and pipelines, constructs a Directed Acyclic Graph (DAG) of operations, and executes them in the correct order.
 - `agent.py` - Run the script and the specified pipeline.
+- `config.yml` - Configures the OpenAI API key, default model parameters, and the pipeline to be executed. Also choose whether to enable wandb logging.
+- `ai.py` - Manages interactions with the OpenAI API.
+- `modules.py` - Contains the available modules and their logic.
+- `orchestrate.py` - Loads modules and pipelines, constructs a Directed Acyclic Graph (DAG) of operations, and executes them in the correct order.
 - `helpers.py` - Provides helper functions, including loading system prompts, parsing chat data, and writing code files.
-- `pipeline.yml` - Describes the sequence of operations to be executed in your pipeline.
 
-## Understanding the Modules
+Pipeline `.yml` files are stored in the `pipelines` folder. This allows you to create multiple pipelines for different tasks. You specify which pipeline the agent runs in the `config.yml` file.
+
+Agents run pipelines -> pipelines orchestrate modules -> modules make LLM calls and use tools.
 
-The `modules.py` file contains different AI modules, each responsible for a specific type of operation, such as `start_module`, `task_planner`, `scrutinizer`, `enhancer`, `code_planner`, `debugger`, and `engineer`.
+## Modules
+
+The `modules.py` file contains different AI modules, each responsible for a specific type of operation.
 
-Each module interacts with the a language model to execute its specific task, and the output is stored for use in subsequent operations as defined in the pipeline. Currently, the modules are only communicating with OpenAI, but this can be extended to other language models as well.
+Each module interacts with a language model to execute its specific task, and the output is stored for use in subsequent operations as defined in the pipeline. Currently, the modules only communicate with OpenAI, but this can be extended to other language models as well.
 
+The most basic module is the `chameleon` module, a generic module used to make an OpenAI call with custom settings and a custom system prompt. This allows you to easily create new modules by simply adding a system prompt to the `system_prompts` folder.
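+
+In practice, a chameleon-style call boils down to loading a system prompt and making a single chat-completion request. The sketch below is illustrative only; the function signature and the assumption about how prompt files are laid out are placeholders, not the actual `modules.py` implementation:
+
+```python
+import openai
+
+def chameleon(prompt, module_name, model="gpt-4", temperature=0.7):
+    """Hypothetical sketch of a generic, prompt-driven module."""
+    # The system prompt is what gives each chameleon-based module its behaviour
+    with open(f"system_prompts/{module_name}") as f:
+        system_prompt = f.read()
+
+    # Single chat completion call using the per-module settings (openai==0.27.x API)
+    response = openai.ChatCompletion.create(
+        model=model,
+        temperature=temperature,
+        messages=[
+            {"role": "system", "content": system_prompt},
+            {"role": "user", "content": prompt},
+        ],
+    )
+    return response["choices"][0]["message"]["content"]
+```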
 
 ### Setting Up a Pipeline
 
-1. Define your pipeline in the `pipeline.yml` file. Each operation in the pipeline consists of a `module`, `inputs`, and an `output_name`. The `module` represents a specific task performed by an AI, `inputs` are the dependencies for the module and the `output_name` is the output of that task, which can be used as input for subsequent operations. Here's an example of a pipeline:
+1. Define your pipeline in a `.yml` file in the `pipelines` folder. Each operation in the pipeline must consist of a `module`, `inputs`, and an `output_name`. The `module` represents a specific task performed by an AI, `inputs` are the dependencies for the module, and the `output_name` is the output of that task, which can be used as input for subsequent operations. Here's an example of a pipeline used for multi-step reasoning to come up with a plan:
 
 ```yaml
-pipeline:
-  - module: start_module
-    inputs: []
+pipeline:
+  - module: start_module
+    inputs: []
     output_name: request
   - module: task_planner
     inputs: [request]
     output_name: task_plan
-  - module: scrutinizer
+  - module: scrutinizer
+    model_config:
+      model: 'gpt-3.5-turbo'
+      temperature: 0.7
     inputs: [request, task_plan]
     output_name: scrutinized_task_plan
   - module: enhancer
-    inputs: [request, scrutinized_task_plan, task_plan]
+    inputs: [request, scrutinized_task_plan, task_plan]
     output_name: enhanced_task_plan
-  - module: code_planner
-    inputs: [request, enhanced_task_plan]
-    supplement: "Use only python."
-    output_name: code_plan
-  - module: engineer
-    inputs: [code_plan]
-    output_name: code
-  - module: debugger
-    inputs: [code]
-    output_name: debugged_code
+  - module: markdown_converter
+    model_config:
+      model: 'gpt-3.5-turbo'
+      temperature: 0.3
+    inputs: [enhanced_task_plan]
+    output_name: markdown_plan
 ```
+For each module present, you can read its system prompt in the `system_prompts` folder.
 
-2. Save and close the `pipeline.yml` file.
+Additionally, you can specify a `model_config` for each module. This allows you to control the OpenAI settings on a per-module basis. For example, easier tasks might only need gpt-3.5-turbo, while more creative tasks might need a higher temperature. The default model_config is specified in the `config.yml` file.
 
+You can also specify a `supplement`, an additional context string that the module will use in its response. For some modules this is necessary to control the desired behaviour, for example in a basic translation pipeline:
+
+```yaml
+pipeline:
+  - module: start_module
+    inputs: []
+    output_name: request
+  - module: translator
+    model_config:
+      model: 'gpt-3.5-turbo'
+      temperature: 0.2
+    inputs: [request]
+    supplement: "Spanish"
+    output_name: translation
+```
+Once you have defined your pipeline, ensure it will be run by specifying the pipeline name in the `config.yml` file.
 
 ### Running the Script
 
+Ensure that you've added your OpenAI API key to `config.yml`.
+
 To run OrchestrAI, execute `agent.py`:
 
 ```bash
-python agent.py
+python3 agent.py
 ```
+The script will execute the operations in the pipeline in the order specified, querying the model as necessary and storing the results in `memory_log.json`.
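+
+Under the hood, `orchestrate.py` uses `networkx` to turn the pipeline definition into a DAG and derive a valid execution order. A rough sketch of the idea (simplified, with an assumed function name, not the exact implementation):
+
+```python
+import networkx as nx
+import yaml
+
+def execution_order(pipeline_path):
+    """Build a DAG from a pipeline file and return its operations in dependency order."""
+    with open(pipeline_path) as f:
+        pipeline = yaml.safe_load(f)["pipeline"]
+
+    graph = nx.DiGraph()
+    for op in pipeline:
+        graph.add_node(op["output_name"], op=op)
+        for dependency in op["inputs"]:
+            # An edge from each input to its consumer enforces "inputs are produced first"
+            graph.add_edge(dependency, op["output_name"])
+
+    # A topological sort yields an order in which every input exists before it is used
+    return [graph.nodes[n]["op"] for n in nx.topological_sort(graph) if "op" in graph.nodes[n]]
+```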
+
+## The Engineering Pipeline
+
+This is an example pipeline built to demonstrate the capabilities of OrchestrAI. It uses AI to generate a working codebase from a high-level description of the task. For a full understanding, please read the system prompts of each module in the `system_prompts` folder and the code in `modules.py`. (This base version can essentially only create Python-based repositories.)
+
+```yaml
+pipeline:
+  - module: start_module
+    inputs: []
+    output_name: request
+  - module: code_planner
+    inputs: [request]
+    output_name: code_plan
+  - module: engineer
+    inputs: [request, code_plan]
+    output_name: code
+  - module: debugger
+    model_config:
+      model: 'gpt-3.5-turbo-16k'
+      temperature: 0.7
+    inputs: [request, code]
+    output_name: working_codebase
+  - module: modify_codebase
+    inputs: [working_codebase]
+    output_name: modified_codebase
+  - module: create_readme
+    inputs: [modified_codebase]
+    output_name: readme
+```
+
+### Code Planning and Engineering
 
-The script will execute the operations in the pipeline in the order specified, querying the GPT-4 model as necessary and storing the results. The 'orchestrate.py' script handles the pipeline and the execution order of the tasks and modules.
+The `code_planner` and `engineer` modules should be used in tandem to create a repository in the `generated_code` folder. The code planner provides a detailed overview of the implementation, and the engineer generates the code to implement the plan. The engineer module is custom and does not use the chameleon. We carry out regex-based parsing to extract the code into the repository. The repository is then condensed into a token-reduced version, which is used as the output `code` of the engineer module. This condensed codebase string can be used as input for the debugger module and/or the modify_codebase module.
 
-## Self-Debugging
+### Self-Debugging
 
 The `debugger` module implements a self-debugging loop to automatically fix errors in the generated code.
 
-It first tries to run `main.py` and if it encounters any runtime errors, it sends the error message back to the model to generate fixed code. During testing, I've found that the gpt-3.5-turbo-16k model is able to fix most errors in the generated code, so this is currently the default model used for debugging. Potentially something could be implemented that swaps to a different model if the error is not fixed after a certain number of iterations.
+It first tries to run `main.py`, and if it encounters any runtime errors, it sends the error message back to the model to generate fixed code. During testing, I've found that the gpt-3.5-turbo-16k model is able to fix most errors in the generated code, so it is recommended as the default model for debugging. Potentially, a mechanism could be added that swaps to a different model if the error is not fixed after a certain number of iterations.
 
 It also checks if `requirements.txt` was modified and re-installs any new dependencies. This loop repeats until `main.py` runs successfully without errors. This enables OrchestrAI to iteratively improve and fix the generated code until it works as intended.
 
-## Memory Logging
+### Modifying the Codebase
 
-The `memory.py` file contains a Logger class that logs the actions of each module. Each action log contains the module name, prompt, response, model used, and a timestamp. The logs are stored in a JSON file named 'memory_log.json'.
+The `modify_codebase` module allows OrchestrAI to modify the generated codebase to add new features. You can ask for modifications to the codebase, and the module will return all modified and new scripts, handling the update to the repository. This module is also custom and does not use the chameleon. It has a nested debugger module, which is then used to fix any errors in the modified codebase.
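+
+Both the `debugger` module and the nested debugger inside `modify_codebase` come down to the same run-and-capture step: execute the generated `main.py` and hand any traceback back to the model. A minimal sketch of that loop (illustrative only; `fix_with_model` is a hypothetical callback, and the real logic in `modules.py` also re-installs changed requirements):
+
+```python
+import subprocess
+
+def debug_loop(fix_with_model, max_attempts=5):
+    """Rerun main.py until it exits cleanly, asking the model to fix any errors."""
+    for _ in range(max_attempts):
+        result = subprocess.run(
+            ["python3", "main.py"],
+            cwd="generated_code",
+            capture_output=True,
+            text=True,
+        )
+        if result.returncode == 0:
+            return True  # main.py ran without errors
+
+        # Send the traceback back to the model, which rewrites the offending files
+        fix_with_model(result.stderr)
+    return False
+```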
 
-This enables OrchestrAI to maintain memory and context across pipeline executions.
+### What can it actually do?
 
-Here is a section on how to create a new module in OrchestrAI:
+It's great at pygame games, fun demos and little experiments, but we're still a (short) way off from autonomously building entire codebases for any task, complete with debugging and modification. But have fun anyway!
 
 ## Creating New Modules
 
+Sometimes you might want to create a more complex module that goes beyond the capabilities of the basic `chameleon` module.
+
 To add new capabilities to OrchestrAI, you can create custom modules.
 
 Steps to create a new module:
@@ -109,21 +170,31 @@ Steps to create a new module:
 def new_module(prompt):
     # Module logic
     ...
-    return output, messages
+    return output
 ```
 
 2. If this module requires a call to OpenAI, design and load the appropriate system prompt for your module from `system_prompts/`.
 
 3. Interact with the model via the `AI` class to generate a response.
 
-4. Log the action using the `Logger` in `memory.py`.
-
-5. Return the output and chat history.
+4. Log the action using local logging and, if desired, wandb logging.
 
-6. Add your new module to the pipeline in `pipeline.yml`, specifying its inputs and outputs.
+5. Add your new module to the pipeline, specifying its inputs and outputs.
 
-7. Run `orchestrate.py` to execute your new pipeline.
+6. Run `agent.py` to execute your new pipeline.
 
 The modularity of OrchestrAI makes it easy to add new AI capabilities as needed for your use case. Simply define the interface in `modules.py` and `pipeline.yml`, and OrchestrAI will automatically coordinate and execute the new module.
 
-Let me know if you need any clarification or have additional questions!
\ No newline at end of file
+## Logging with Weights and Biases
+
+By default, interactions are logged with `log_action` during the run to a file created when the agent starts. On termination, this file is renamed and moved to the `logs` folder, so you can see the full history of the agent's interactions.
+
+However, we can also leverage the power of Wandb Prompts (https://docs.wandb.ai/guides/prompts?_gl=1).
+
+Provided you've set up wandb, you can enable wandb logging in the `config.yml` file.
+
+This logs the agent as a run, with modules as child runs of the chain, so you can see the full history of the agent's interactions and of each module's interactions. This is useful for debugging, and for understanding the agent's behaviour as you explore different pipelines and modules.
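+
+The gist of the toggle is that wandb is only initialised, and modules only emit wandb logs, when the flag in `config.yml` is set. A simplified sketch of the pattern (the project name is a placeholder; see `wandb_logging.py` for the actual implementation):
+
+```python
+import wandb
+import yaml
+
+with open("config.yml") as f:
+    config = yaml.safe_load(f)
+
+if config.get("wandb_enabled", False):
+    # One wandb run per agent execution; module calls are logged as children of this run
+    wandb.init(project="orchestrai")
+```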
+
+------------------
+
+Let me know if you need any clarification or have additional questions!
diff --git a/agent.py b/agent.py
index 55f6c42..0a83d3f 100644
--- a/agent.py
+++ b/agent.py
@@ -64,13 +64,19 @@ def main():
 
 ## Just in case it crashes, we want to log the root span to wandb anyway so we use atexit
 def closing_log():
+    agent_end_time_ms = round(datetime.datetime.now().timestamp() * 1000)
-    os.rename('memory_log.json', f'log_{agent_end_time_ms}.json')
-    os.replace(f'log_{agent_end_time_ms}.json', f'logs/log_{agent_end_time_ms}.json')
-
-    # delete the memory log
-    os.remove('memory_log.json')
+    # Add the timing information to the existing memory log rather than overwriting it
+    timing = {
+        'agent_end_time': agent_end_time_ms,
+        'run_time': agent_end_time_ms - globals.agent_start_time_ms,
+    }
+    with open('memory_log.json', 'r') as file:
+        memory_log = json.load(file)
+    if isinstance(memory_log, dict):
+        memory_log.update(timing)
+    else:
+        memory_log.append(timing)
+    with open('memory_log.json', 'w') as file:
+        json.dump(memory_log, file)
+
+    # Build a readable timestamp for the archived log name
+    current_time = datetime.datetime.fromtimestamp(agent_end_time_ms / 1000).strftime("%Y-%m-%d_%H-%M-%S")
+
+    os.replace('memory_log.json', f'logs/log_{current_time}.json')
+
     if wandb.run is None:
         return
@@ -78,8 +84,6 @@ def closing_log():
     root_span._span.end_time_ms = agent_end_time_ms
     root_span.log(name="pipeline_trace")
-
-# Register the function to be called on exit
 atexit.register(closing_log)
diff --git a/config.yml b/config.yml
index a6d0d97..e3fabf6 100644
--- a/config.yml
+++ b/config.yml
@@ -15,4 +15,4 @@ default_frequency_penalty: 0
 default_presence_penalty: 0
 
 ### Logging ###
-wandb_enabled: true
+wandb_enabled: false
diff --git a/plan.txt b/plan.txt
deleted file mode 100644
index 35d60b5..0000000
--- a/plan.txt
+++ /dev/null
@@ -1,69 +0,0 @@
-#### Here are the remaining things I need to do to finalise the project:
-
-CODE
-
-# Wandb logging
-
-- Add wandb logging to the pipeline. DONE
-- Fix the fact that the table logging breaks the pipeline logging. DONE
-- Generate pipeline visualisation which I can log to wandb. DONE
-
-# General cleanup
-
-- Add the ability to run folders and create folders DONE
-- Regular logging
-- If debugging message is a missing file, it want juman intervention to say "I've added the file"
-- Fix the debugging so human input actually works.
-- Add more comments and docstrings, make sure everything is clear.
-- Example pipelines -- almost there
-- Fix realist issues. DONE
-- Issues with structuring of system prompt. Additional user input issues. DONE
-- Each module should have the the original query specified separately in system prompt. DONE
-- Control temperature from pipeline file (only if specified). DONE
-- Control model used from pipeline file (only if specified, else use default gpt-4). DONE
-
-# Documentation
-
-- README reflective of the project, section for pre-built pipelines vs how to build your own.
-- Demo video(s) and gifs.
-- Example pipeline creation.
-- Tracking token usage???? - just write about it
-- Update requirements.txt
-
-
-# Report
-
-- Add a cool gif at the beginning
-- Discuss cost.
-- Finish the article.
-- talk about the fact the modular system means easy to construct and maintain.
-- decomposable parts means less memory problems.
-- pipeline means you can start from an exisiting repository.
-- self building options.
-- write the Future section.
-- Add references to article
-
-CHANGES
-
-Provide an update README that is extremely comprehensive of how the project works. Things that the current README says which need to be extended/altered/changed are.x
-
-config.py
--- Explain this file and that it is how you set open AI key and default model parameters, choose the pipeline your agent will run, and whether you will include wandb logging for the agent.
-
-pipelines
-- Pipelines are now in a folder called pipelines. This allows you to create multiple for different tasks. Use config to specify the one the agent runs.
-- You can specify model_config in the pipeline. This allows you to control the openai settings on a per module basis, demonstrate an example of doing this. For example, easier tasks might want gpt-3.5-turbo, or more creative tasks might need higher temperature.
-- explain the concept of the supplement:
-
-chameleon module
-Instead of having multiple duplicated modules, we now have a standard AI call module called the chameleon. This is invoked when a system prompt exists for the model but no additional function does. THis means that to create a new basic module, i.e a call to an LLM with a custom systme prompt is as simple as creating the prompt in the system_prompts folder and adding it to the pipeline
-
-Other more complex modules are still in the module section.
-
-
-wandb_logging.py
--- Explain the wandb process, where agent.py creates an Agent trace, where agents are the parents of pipelines. Each pipeline can invoke multiple modules, logged within the chain as LLMs and tools
--- Wandb logging takes place in the modules.py file, where each module has logging included if wandb is enabled.
-
-Look through all the code and compare it to the README. Anything no longer true should be update to reflect the new state of the project. Provide the raw markdown don't actually duspaly it.
- @ai.py @wandb_logging.py @agent.py @orchestrate.py @modules.py @config.yml @engineering_pipeline.yml @helpers.py
\ No newline at end of file
diff --git a/requirements.txt b/requirements.txt
index 26d7198..11ad887 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,7 +1,4 @@
 globals==0.3.36
-matplotlib==3.7.1
 networkx==2.8.8
 openai==0.27.8
-pandas==2.0.3
-PyYAML==6.0
-wandb==0.15.9
+PyYAML==6.0
\ No newline at end of file