* Prints the version of AutoGenBench from the command line, closing i1458
* Added autogenbench version to timestamp.txt
* Attempting to fix formatting.
* Add a gitignore for autogenbench
* Generalize to read all template dirs from Templates
* AutoGenBench logs telemetry when available.
* Remove spaces if present from template names.
* Bump version.
* Fixed formatting.
* Allow native warning to be skipped. Mount autogen repo in Docker if it can be found (experimental).
* Native execution now occurs in a venv.
* Bump version.
* Fixed a prompt escaping bug evident in GAIA task '6f37996b-2ac7-44b0-8e68-6d28256631b4'
* Updated all scenarios to use template discovery.
* Update with main version of runtime_logging.
---------
Co-authored-by: gagb <[email protected]>
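The commit message mentions reading every template directory under Templates and removing spaces from template names. A minimal sketch of that idea follows. The function name `discover_templates` and the directory names created in the demo are hypothetical, and the actual AutoGenBench implementation may differ.

```python
import tempfile
from pathlib import Path


def discover_templates(templates_dir: Path) -> dict:
    """Map a space-free template name to each subdirectory of templates_dir."""
    found = {}
    for entry in sorted(templates_dir.iterdir()):
        if entry.is_dir():  # plain files are not templates
            found[entry.name.replace(" ", "")] = entry
    return found


# Demo on a throwaway directory tree (all names here are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "Templates"
    (root / "BasicTwoAgents").mkdir(parents=True)
    (root / "Two Agents").mkdir()  # hypothetical name containing a space
    (root / "README.md").write_text("not a template")
    names = sorted(discover_templates(root))
```

Discovering templates from the directory listing means a new template only needs a new folder, not a change to a hard-coded list.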
         sys.exit("--requirements is not compatible with --native. Exiting.")

-    choice = input(
-        'WARNING: Running natively, without Docker, not only poses the usual risks of executing arbitrary AI generated code on your machine, it also makes it impossible to ensure that each test starts from a known and consistent set of initial conditions. For example, if the agents spend time debugging and installing Python libraries to solve the task, then those libraries will be available to all other runs. In other words, earlier runs can influence later runs, leading to many confounds in testing.\n\nAre you absolutely sure you want to continue with native execution? Type "Yes" exactly, and in full, to proceed: '
+    sys.stderr.write(
+        "WARNING: Running natively, without Docker, not only poses the usual risks of executing arbitrary AI generated code on your machine, it also makes it impossible to ensure that each test starts from a known and consistent set of initial conditions. For example, if the agents spend time debugging and installing Python libraries to solve the task, then those libraries will be available to all other runs. In other words, earlier runs can influence later runs, leading to many confounds in testing.\n\n"
     )

-    if choice.strip().lower() != "yes":
-        sys.exit("Received '" + choice + "'. Exiting.")
+    # Does an environment variable override the prompt?
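The hunk above is cut off right after the comment about an environment variable, so the override logic itself is not visible. The following is only a sketch of how such a check could work, under stated assumptions: the variable name `EXAMPLE_ALLOW_NATIVE`, the helper `confirm_native`, and the abbreviated warning text are all hypothetical and are not taken from the commit.

```python
import os
import sys

# Hypothetical variable name, used only for this illustration.
OVERRIDE_VAR = "EXAMPLE_ALLOW_NATIVE"

# Abbreviated stand-in for the full warning shown in the diff.
NATIVE_WARNING = "WARNING: Running natively, without Docker, is risky.\n\n"


def confirm_native(environ=None, ask=input, err=sys.stderr):
    """Return True if native execution may proceed."""
    environ = os.environ if environ is None else environ
    # The warning always goes to stderr, even when the prompt is skipped.
    err.write(NATIVE_WARNING)
    # An explicit opt-in through the environment skips the interactive prompt.
    if environ.get(OVERRIDE_VAR, "").strip().lower() == "yes":
        return True
    choice = ask('Type "Yes" exactly, and in full, to proceed: ')
    return choice.strip().lower() == "yes"
```

Writing the warning to stderr rather than folding it into the `input()` prompt keeps it visible in non-interactive runs, where no prompt is shown.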
samples/tools/autogenbench/scenarios/GAIA/Templates/BasicTwoAgents/scenario.py (+5 -3)
@@ -7,6 +7,10 @@
 testbed_utils.init()
 ##############################

+# Read the prompt
+PROMPT = ""
+with open("prompt.txt", "rt") as fh:
+    PROMPT = fh.read().strip()

 GAIA_SYSTEM_MESSAGE = (
     "You are a helpful AI assistant, and today's date is "
@@ -48,9 +52,7 @@
 )

 filename = "__FILE_NAME__".strip()
-question = """
-__PROMPT__
-""".strip()
+question = PROMPT

 if len(filename) > 0:
     question = f"Consider the file '{filename}', which can be read from the current working directory. If you need to read or write it, output python code in a code block (```python) to do so. {question}"
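The commit message does not say what the escaping bug in the GAIA task was. One plausible illustration is that pasting prompt text into a `"""..."""` literal lets Python reinterpret backslash sequences, whereas reading `prompt.txt` treats the prompt purely as data. In the sketch below, the example prompt and the helpers `render_by_substitution`, `load_question`, and `read_prompt` are hypothetical and are not part of AutoGenBench.

```python
import tempfile
from pathlib import Path

# Mirrors the removed lines: the prompt is pasted into Python source.
TEMPLATE = 'question = """\n__PROMPT__\n""".strip()\n'


def render_by_substitution(prompt: str) -> str:
    """Old approach: substitute the prompt text into the scenario source."""
    return TEMPLATE.replace("__PROMPT__", prompt)


def load_question(source: str) -> str:
    """Execute the generated source and return the value of `question`."""
    namespace = {}
    exec(source, namespace)
    return namespace["question"]


def read_prompt(directory: Path) -> str:
    """New approach: read the prompt as data, never as source code."""
    with open(directory / "prompt.txt", "rt") as fh:
        return fh.read().strip()


# Illustrative prompt (not the GAIA task text) containing backslash sequences.
prompt = r"Count the \n sequences in C:\temp\new.txt"

# Substitution turns \n and \t into real newline and tab characters.
mangled = load_question(render_by_substitution(prompt))

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "prompt.txt").write_text(prompt + "\n")
    intact = read_prompt(Path(tmp))
```

Under the same assumption, a prompt that itself contains `"""` would be worse still, since the substituted source would no longer parse.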