Allow users to specify the Docker image to use with Testbed #986

afourney · 2023-12-14T22:30:13Z

Why are these changes needed?

Previously, scenarios run on the Testbed started from a clean "python:3.11" Docker image pulled from Docker Hub. Dependencies were then installed from the requirements file, and this whole process was repeated for each task being evaluated.

This was mostly fine when running simple AutoGen scenarios, but it has become slow as AutoGen's dependencies have grown heavier and as new benchmarks require additional dependencies.

Importantly, if no image is specified, this code will build an appropriate one for testing. Specifically, it installs AutoGen's dependencies, but not AutoGen itself, since we often want to test a particular branch pulled and installed from Git. It also includes the 10 libraries listed at https://learnpython.com/blog/most-popular-python-packages/, namely:

numpy
pandas
matplotlib
seaborn
scikit-learn
requests
urllib3
nltk
pillow
pytest
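
The fallback behaviour described above could be sketched roughly as follows. This is an illustration only, not the Testbed's actual code: the names `resolve_image` and `default_dockerfile` and the tag `autogen/testbed:default` are hypothetical, and AutoGen's own dependencies are left out because the PR description does not list them.

```python
from typing import List, Optional, Tuple

# The ten libraries named in the PR description.
DEFAULT_LIBRARIES = [
    "numpy", "pandas", "matplotlib", "seaborn", "scikit-learn",
    "requests", "urllib3", "nltk", "pillow", "pytest",
]

# Hypothetical tag for a locally built default image (an assumption).
DEFAULT_IMAGE_TAG = "autogen/testbed:default"


def default_dockerfile(libraries: Optional[List[str]] = None,
                       base: str = "python:3.11") -> str:
    """Return Dockerfile text that pre-installs the given libraries.

    AutoGen itself is deliberately not installed, so that a particular
    branch can still be pulled and installed from Git at run time.
    """
    libs = DEFAULT_LIBRARIES if libraries is None else libraries
    lines = [
        f"FROM {base}",
        "RUN pip install --upgrade pip",
        "RUN pip install " + " ".join(libs),
    ]
    return "\n".join(lines) + "\n"


def resolve_image(requested: Optional[str]) -> Tuple[str, bool]:
    """Return (image_name, is_default).

    A user-specified image is used as-is; otherwise the caller is told
    to fall back to the default image, building it if necessary.
    """
    if requested:
        return requested, False
    return DEFAULT_IMAGE_TAG, True
```

A caller would check the second element of the tuple and, when it is true, build the image from `default_dockerfile()` before running any tasks, so the installation cost is paid once rather than per task.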

NOTE: The list of libraries to install by default deserves careful consideration. On the one hand, we likely want to assume some basic core libraries are available (e.g., requests, urllib3). On the other hand, determining which libraries are important for a task is often part of solving the task itself (equivalent to automatic tool selection). The decision of which libraries to include should therefore be made very carefully and in advance of considering any benchmarks or problem sets. My current thinking on this is to only pre-install libraries that:

- are basic AutoGen requirements (and so would be installed anyway)
- are common enough to appear on lists provided by independent, neutral third parties (as above)

If a given benchmark requires other libraries to run (e.g., sympy for MATH), then it is reasonable to create a Docker Image just for that benchmark.
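
As a rough illustration of such a benchmark-specific image, the sketch below generates a Dockerfile that adds one layer on top of a base image. The helper name `benchmark_dockerfile` and the base tag `autogen/testbed:default` are assumptions made for this example; neither comes from the PR.

```python
from typing import List


def benchmark_dockerfile(base_image: str, extra_libraries: List[str]) -> str:
    """Return Dockerfile text that extends base_image with extra libraries."""
    if not extra_libraries:
        raise ValueError("at least one extra library is required")
    return (
        f"FROM {base_image}\n"
        f"RUN pip install {' '.join(extra_libraries)}\n"
    )


# Example: an image for the MATH benchmark that adds sympy.
# The base tag below is hypothetical.
math_dockerfile = benchmark_dockerfile("autogen/testbed:default", ["sympy"])
print(math_dockerfile)
```

Keeping the extra library in a separate image leaves the default image free of benchmark-specific dependencies, in line with the selection criteria above.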

Related issue number

Closes #985 and addresses the comment made by @kevin666aa in #792

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

codecov-commenter · 2023-12-14T22:33:37Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (3a768c3) 26.63% compared to head (7962d07) 26.63%.
Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #986   +/-   ##
=======================================
  Coverage   26.63%   26.63%           
=======================================
  Files          28       28           
  Lines        3777     3777           
  Branches      858      858           
=======================================
  Hits         1006     1006           
  Misses       2700     2700           
  Partials       71       71

Flag	Coverage Δ
unittests	`26.58% <ø> (ø)`

Flags with carried forward coverage won't be shown.

victordibia · 2023-12-14T23:05:19Z

PR looks good.
I think pre-installing known useful libraries will improve the Testbed dev experience. Also, devs can update the Dockerfile as needed.

samples/tools/testbed/Dockerfile

yiranwu0

LGTM!

Allow users to specify the Docker image to use (or build a good AutoGen default image if not specified).

3396254

afourney added the proj-autogenbench Issues related to AutoGenBench. label Dec 14, 2023

afourney requested review from qingyun-wu, yiranwu0, LeoLjl and a team December 14, 2023 22:30

afourney self-assigned this Dec 14, 2023

LeoLjl approved these changes Dec 15, 2023

samples/tools/testbed/Dockerfile (outdated, resolved)

Added lmm and graphs to dockerfile

7962d07

yiranwu0 approved these changes Dec 15, 2023

sonichi added this pull request to the merge queue Dec 15, 2023

Merged via the queue into main with commit 4dcb415 Dec 15, 2023
16 checks passed

sonichi deleted the testbed_docker branch January 1, 2024 17:36

afourney mentioned this pull request Jan 2, 2024

guide on the usage of docker #1111

Merged

3 tasks

whiskyboy pushed a commit to whiskyboy/autogen that referenced this pull request Apr 17, 2024

Blog post for LLM tuning (microsoft#986)

7b26f45

* outline * revision * eval function signature * first draft * link * format * example * cleanup * average * move figure * tldr * bold * bold * tag
