Executing python code directly from AI output #286

Closed
luiz00martins opened this issue Apr 6, 2023 · 15 comments
Labels
enhancement (New feature or request), needs discussion (To be discussed among maintainers), Stale

Comments

@luiz00martins

I've encountered the same problem as #101. GPT thinks that `evaluate_code` will execute Python:

CODEREVIEW-GPT THOUGHTS: Let's find some code to review and suggest improvements.
REASONING: We need to search for files with common code extensions that are not too large and more recent than a certain date.
PLAN:
-  Use the 'evaluate_code' command to run a Python script that searches for files with common code extensions in the current folder and subfolders.
-  Use the 'memory_add' command to save the file paths for later reference and reuse.
-  Review the saved files using 'read_file' command to ensure they are indeed code files.
-  Choose a file to review and suggest improvements.
CRITICISM: We may need to adjust the search criteria if we don't find relevant code or if too many files match the criteria.
NEXT ACTION:  COMMAND = evaluate_code ARGUMENTS = {'code': ...}, where the code argument, unescaped, is:

```python
import pathlib
from typing import List

def find_code_files(root: str, extensions: List[str], max_size: int, min_date_modified: float) -> List[str]:
    """
    Returns a list of file paths for code files with the specified extensions, maximum size,
    and minimum date modified, rooted in the given directory and its subdirectories.
    """
    files = [str(path) for path in pathlib.Path(root).rglob("*")
             if path.suffix in extensions
             and path.stat().st_size <= max_size
             and path.stat().st_mtime >= min_date_modified]
    return files

def print_files(files: List[str]) -> None:
    """
    Prints the list of file paths to the console, one file per line.
    If no relevant files are found, prints a message indicating so.
    """
    for file in files:
        print(file)
    else:
        print("No relevant files found.")

def analyze_code(code: str) -> None:
    """
    Analyzes the given code and provides suggestions for improvements.
    """
    pass

if __name__ == '__main__':
    root: str = '.'
    extensions: List[str] = ['.py', '.js', '.java']
    max_size: int = 10000  # bytes
    min_date_modified: float = 1609459200  # seconds since epoch (01/01/2021)

    files: List[str] = find_code_files(root, extensions, max_size, min_date_modified)
    print_files(files)
```

On one hand, that's a bug, which is addressed in #101.

On the other hand... that's a very interesting idea from GPT. Giving it the ability to execute Python code directly could let it handle a lot of tasks in a far more dynamic way.
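For concreteness, a minimal sketch of what such a command might look like (the name `execute_python_code` and its signature are assumptions, not an existing command in the project):

```python
import subprocess
import sys

def execute_python_code(code: str, timeout: int = 30) -> str:
    """Hypothetical command: run Python source passed as a string and
    return its output, so the agent never has to write a file first.

    Naive sketch only -- there is no sandboxing here, which is the hard
    part (see the discussion further down the thread).
    """
    result = subprocess.run(
        [sys.executable, "-c", code],  # run the snippet directly
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout + result.stderr
```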

@drammen94

What do you mean? It's already a function in the project.

@luiz00martins
Author

Oh, really? My bad then 😄

What's the name of the command?

@drammen94

execute_python_file(arguments["file"])

@luiz00martins
Author

Is the file argument a path or actual code?

The feature I'm proposing is the direct execution of code. "file" makes it seem like it's a path to a file already in the system.
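For contrast, a sketch of the two call shapes (the second command is the hypothetical proposal, not something that exists in the codebase):

```python
# Existing: takes a path to a script already on disk
execute_python_file(arguments["file"])   # e.g. {"file": "scripts/task.py"}

# Proposed (hypothetical): takes the code itself as a string
execute_python_code(arguments["code"])   # e.g. {"code": "print('hello')"}
```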

@yourfavtheo

Same problem here too:

NEXT ACTION:  COMMAND = execute_python_file ARGUMENTS = {'file': '<path_to_python_script>'}
Executing file '<path_to_python_script>' in workspace 'auto_gpt_workspace'
SYSTEM:  Command execute_python_file returned: Error: Invalid file type. Only .py files are allowed.

I asked it to create a Python script, and it just tries to execute "<path_to_python_script>" literally.
Any solution?

@Qoyyuum Qoyyuum added the enhancement and Needs Benchmark labels Apr 16, 2023
@Pwuts
Member

Pwuts commented Apr 18, 2023

Closing as duplicate of #101

@Pwuts Pwuts closed this as not planned Apr 18, 2023
@Pwuts Pwuts removed the Needs Benchmark label Apr 18, 2023
@luiz00martins
Author

Not a duplicate. This is a feature request for direct Python code execution.

@Pwuts
Member

Pwuts commented Apr 18, 2023

That is already implemented

@luiz00martins
Author

As execute_python_file? I'm not sure that's the same thing as what I described in my original message.

As I said:

Is the file argument a path or actual code?

The feature I'm proposing is the direct execution of code. "file" makes it seem like it's a path to a file already in the system.

If you want to close as "won't do", that's okay. But I don't think it's a duplicate.

@Pwuts
Member

Pwuts commented Apr 18, 2023

Ah, thanks for elaborating! I think this is something we could add without too much effort. The tricky thing is to properly sandbox it, in a way equivalent to execute_python_file.
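One plausible approach (an assumption on my part, not a committed design): write the supplied snippet into the agent's workspace and delegate to the existing execute_python_file command, so inline code inherits whatever isolation that path already has (workspace confinement, Docker execution):

```python
import uuid
from pathlib import Path

def execute_python_code(code: str, workspace: Path) -> str:
    """Sketch: reuse execute_python_file's sandbox for inline code.

    Assumes the project's existing execute_python_file command is
    importable here; the snippet then runs under the same restrictions
    as any other .py file the agent executes.
    """
    script = workspace / f"snippet_{uuid.uuid4().hex}.py"
    script.write_text(code)
    try:
        return execute_python_file(str(script))  # existing, sandboxed command
    finally:
        script.unlink(missing_ok=True)           # clean up the temp script
```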

@Pwuts Pwuts reopened this Apr 18, 2023
@Pwuts Pwuts moved this to 📋 Backlog in AutoGPT development kanban Apr 18, 2023
@Pwuts Pwuts added the needs discussion label Apr 18, 2023
@Pwuts Pwuts changed the title Execution of python code Executing python code directly from AI output Apr 18, 2023
Pwuts added a commit that referenced this issue Apr 19, 2023
ChatGPT is less confused by this phrasing

From my own observations and others' (i.e. #101 and #286), ChatGPT seems to think that `evaluate_code` will actually run code, rather than just provide feedback. Since changing the phrasing to `analyze_code`, I haven't seen the AI make this mistake.

---------

Co-authored-by: Reinier van der Leer <[email protected]>
@Boostrix
Contributor

Boostrix commented May 5, 2023

This has more to do with in-memory execution of code that isn't written to disk, I suppose?
If so, that's related to the API discovery idea #56.
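If "in-memory" is taken literally, the snippet would never touch the filesystem at all - roughly like the hedged sketch below (exec() is emphatically not a sandbox):

```python
import contextlib
import io

def run_in_memory(code: str) -> str:
    """Sketch: execute a code string without ever writing it to disk.

    WARNING: exec() runs with the host process's full privileges, so
    this only becomes viable with real sandboxing on top.
    """
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):  # capture the snippet's prints
        exec(code, {})                        # fresh globals; NOT a security boundary
    return buffer.getvalue()
```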

@luiz00martins
Author

luiz00martins commented May 5, 2023

Yeah, it is somewhat related.

I think this issue might supersede that one, or that issue might supersede this one, depending on how it's implemented. Although this one is a bit more general (e.g. the agent might spin up a Python instance just to do some calculations, so nothing necessarily related to an API).


Edit: As a matter of fact, now that I think about it, these should probably be separate tasks. Meaning, a search_for_api task, followed by a write_python_code task which would use that knowledge (roughly as sketched below). That would keep the system more general, but still fulfil the capabilities of #56.
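A rough illustration of that split (task names and signatures are hypothetical):

```python
def search_for_api(query: str) -> str:
    """Hypothetical task: research a relevant API (crawl docs, search
    results) and return structured notes: endpoints, auth, examples."""
    ...

def write_python_code(task: str, api_notes: str = "") -> str:
    """Hypothetical task: generate Python code for the given task,
    optionally grounded in the notes from search_for_api."""
    ...

# The agent chains them: discover the API first, then write code that
# uses it -- each task stays general-purpose on its own.
```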

@Boostrix
Contributor

Boostrix commented May 6, 2023

search_for_api would be a specialization of a do_research (crawl) command, whereas the API could be either a classical API or a networking API.

Some of us have already succeeded in getting Agent-GPT to write code by exploring API docs - my recent experiments had it download the GitHub API docs and come up with a CLI tool that filters PRs by excluding those touching the same paths/files: master...Boostrix:Auto-GPT:topic/PRHelper.py

While this is trivial in nature, it can already be pretty helpful for identifying PRs that can be easily reviewed/integrated because they're not stepping on anyone's toes. And it would be easy to extend as well.

The point being: having some sort of search-API / extend-yourself mechanism is exactly what many folks here are suggesting when it comes to "self-improving" in its simplest form - adding features without having to write much/any code.

So, thinking about it, I'm inclined to think that commands should be based on classes that can be extended - a research command would be based on a crawler/spider class (HTTP requests: #2730), and a find_api command would be based on the research command class (see the sketch below).

That way, you can have your cake and eat it too, while also ensuring that the underlying functionality (searching/exploring the solution space) is available for other use cases - like the idea of hooking up the agent to a research paper server (#826) or making it process PDF files (#1353).

Commands in their current form have worked, but to support scaling and reduce code rot, it would make sense to identify overlapping functionality and then use a layered approach for common building blocks.

The "API explorer" you mentioned could also be API based itself, so there is no need to go through HTML scraping - but some folks may need exactly that, so a scraping mechanism would be a higher-level implementation of a crawler #2730

Related talks collated here: #514 (comment)

@github-actions
Contributor

This issue was closed automatically because it has been stale for 10 days with no activity.

@github-actions github-actions bot closed this as not planned Sep 17, 2023
@Boostrix
Copy link
Contributor

Boostrix commented Oct 4, 2023

regarding the API explorer idea: #5536

sindlinger pushed a commit to Orgsindlinger/Auto-GPT-WebUI that referenced this issue Sep 25, 2024