Allow users to specify the Docker image to use with Testbed #986
Codecov Report
All modified and coverable lines are covered by tests ✅

Coverage diff, main vs. #986:

| Metric | main | #986 |
|---|---|---|
| Coverage | 26.63% | 26.63% |
| Files | 28 | 28 |
| Lines | 3777 | 3777 |
| Branches | 858 | 858 |
| Hits | 1006 | 1006 |
| Misses | 2700 | 2700 |
| Partials | 71 | 71 |

Flags with carried forward coverage won't be shown.
PR looks good.
LGTM!
…t#986) * Allow users to specify the Docker image to use (or build a good AutoGen default image if not specified). * Added lmm and graphs to dockerfile
* outline * revision * eval function signature * first draft * link * format * example * cleanup * average * move figure * tldr * bold * bold * tag
Why are these changes needed?
Previously, scenarios run on the Testbed started from a clean "python:3.11" Docker image pulled from Docker Hub. Dependencies were then installed from the requirements file, and this whole process was repeated for each task being evaluated.
This was mostly fine when running simple AutoGen scenarios, but it has become slow as AutoGen's dependencies have grown heavier and as new benchmarks require additional dependencies. This PR lets users specify the Docker image to use with the Testbed instead.
Importantly, if no image is specified, this code will build an appropriate one for testing. Specifically, it will install AutoGen's dependencies, but not AutoGen itself, as we often want to test a particular branch pulled and installed from Git. It also includes the 10 libraries listed here: https://learnpython.com/blog/most-popular-python-packages/
NOTE: The list of libraries to install by default deserves careful consideration. On the one hand, we likely want to assume some basic core libraries are available (e.g., requests, urllib3). On the other hand, determining which libraries are important for a task is often part of solving the task itself (equivalent to automatic tool selection). The decision of which libraries to include should therefore be made very carefully and in advance of considering any benchmarks or problem sets. My current thinking on this is to only pre-install libraries that:
- are basic AutoGen requirements (and so would be installed anyway)
- are common enough to appear on lists provided by independent/neutral third parties (as above)
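The two criteria above can be sketched as a small helper that merges the two package groups and renders a Dockerfile for the default image. This is only an illustration, not the code in this PR: the function names `merge_packages` and `render_default_dockerfile` are hypothetical, and the package names in the usage example are placeholders rather than the actual lists.

```python
import shlex


def merge_packages(*groups):
    """Merge package groups, dropping duplicates but keeping first-seen order."""
    seen = set()
    merged = []
    for group in groups:
        for name in group:
            cleaned = name.strip()
            key = cleaned.lower()
            if key and key not in seen:
                seen.add(key)
                merged.append(cleaned)
    return merged


def render_default_dockerfile(autogen_requirements, neutral_list, base_image="python:3.11"):
    """Render Dockerfile text that pre-installs dependencies only.

    AutoGen itself is deliberately NOT installed, so that a particular
    branch can be installed from Git when a scenario runs.
    """
    packages = merge_packages(autogen_requirements, neutral_list)
    lines = [f"FROM {base_image}"]
    if packages:
        # Quote each requirement so version specifiers such as ">=" survive the shell.
        quoted = " ".join(shlex.quote(p) for p in packages)
        lines.append(f"RUN pip install --no-cache-dir {quoted}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder package names, for illustration only.
    print(render_default_dockerfile(["requests", "termcolor"], ["requests", "urllib3"]))
```

Keeping the package lists as explicit inputs makes it easy to review them independently of any benchmark, which is the point of deciding the list in advance.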
If a given benchmark requires other libraries to run (e.g., sympy for MATH), then it is reasonable to create a Docker image just for that benchmark.
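The resulting selection logic (use the image the user names, such as a benchmark-specific one, and otherwise fall back to a default image that is built once) might look roughly like the sketch below. The tag `autogen/testbed-default` and the names `resolve_image`, `image_exists`, and `build_default` are assumptions made for illustration; they are not taken from the Testbed code.

```python
DEFAULT_IMAGE = "autogen/testbed-default"  # hypothetical tag, not from the PR


def resolve_image(requested_image, image_exists, build_default, default_image=DEFAULT_IMAGE):
    """Return the name of the Docker image a scenario should run in.

    requested_image: image named by the user, or None/"" if not specified.
    image_exists:    callable(name) -> bool, checks for a local image.
    build_default:   callable(name), builds the default image under that name.
    """
    if requested_image:
        # A user-supplied (e.g. benchmark-specific) image always wins.
        return requested_image
    if not image_exists(default_image):
        # Build once; later tasks reuse the image instead of reinstalling.
        build_default(default_image)
    return default_image


if __name__ == "__main__":
    # In-memory stand-ins for a Docker client, so the sketch runs anywhere.
    local_images = set()
    build_log = []

    def fake_exists(name):
        return name in local_images

    def fake_build(name):
        build_log.append(name)
        local_images.add(name)

    print(resolve_image("my-org/math-benchmark", fake_exists, fake_build))
    print(resolve_image(None, fake_exists, fake_build))
    print(resolve_image(None, fake_exists, fake_build))
    print("builds performed:", len(build_log))
```

With the Docker SDK for Python, `image_exists` could plausibly wrap `client.images.get(name)` and catch `docker.errors.ImageNotFound`, and `build_default` could wrap `client.images.build(path=..., tag=name)`; the callables are injected here so the logic can be exercised without a Docker daemon.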
Related issue number
Closes #985 and addresses the comment made by @kevin666aa in #792
Checks