I am trying to create my own Eval to evaluate my model, but it seems like I can't even run test-match.
I am a Python newbie, so I could be missing something, but it looks like Evals might be in a bad state? (I did notice a lot of recent changes related to dependencies and to the area where things are failing, but I need a second pair of eyes to confirm there is an issue, please. =)
Note: Python 3.11.6 on macOS 14 (Sonoma), but I don't think this is the source of the problem.
Note 2: I don't expect this to work since I have yet to set up my API key, but this is just the reduced test case to replicate the issue.
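Steps to reproduce (a minimal sketch: the editable install matches the pip output below, while the gpt-3.5-turbo completion-fn argument is a placeholder on my part):

```sh
# from the root of the evals checkout; matches the "Building editable" pip output below
pip install -e .

# run the basic match eval; gpt-3.5-turbo is a placeholder completion fn
oaieval gpt-3.5-turbo test-match
```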
Here is a dump of the output and stack trace:
Building wheels for collected packages: evals
Building editable for evals (pyproject.toml) ... done
Created wheel for evals: filename=evals-1.0.3.post1-0.editable-py3-none-any.whl size=4945 sha256=3393f46e0147a629a88a9d4de137e6c0d3038b1591fa3b0beccead0f2bc77889
Stored in directory: /private/var/folders/n5/q9nl63rj51j85dz54l3k2j6m0000gn/T/pip-ephem-wheel-cache-r614x0i6/wheels/9b/27/20/476deee3ab3207f3e0943d8fc1b8d844ba71bf765d5c7afcbe
Successfully built evals
Installing collected packages: evals
Attempting uninstall: evals
Found existing installation: evals 1.0.3.post1
Uninstalling evals-1.0.3.post1:
Successfully uninstalled evals-1.0.3.post1
Successfully installed evals-1.0.3.post1
[2023-11-15 09:13:50,996] [registry.py:254] Loading registry from /private/tmp/evals/evals/registry/evals
[2023-11-15 09:13:51,296] [registry.py:254] Loading registry from /Users/john/.evals/evals
[2023-11-15 09:13:51,297] [oaieval.py:189] Run started: 231115171351U6WEASBU
[2023-11-15 09:13:51,299] [eval.py:36] Evaluating 3 samples
[2023-11-15 09:13:51,306] [eval.py:144] Running in threaded mode with 10 threads!
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/homebrew/bin/oaieval", line 8, in <module>
sys.exit(main())
^^^^^^
File "/private/tmp/evals/evals/cli/oaieval.py", line 274, in main
run(args)
File "/private/tmp/evals/evals/cli/oaieval.py", line 223, in run
result = eval.run(recorder)
^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/elsuite/basic/match.py", line 60, in run
self.eval_all_samples(recorder, samples)
File "/private/tmp/evals/evals/eval.py", line 146, in eval_all_samples
idx_and_result = list(tqdm(iter, total=len(work_items), disable=not show_progress))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 873, in next
raise value
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/eval.py", line 137, in eval_sample
return idx, self.eval_sample(sample, rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/elsuite/basic/match.py", line 46, in eval_sample
result = self.completion_fn(
^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/completion_fns/openai.py", line 129, in __call__
result = openai_chat_completion_create_retrying(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/backoff/_sync.py", line 105, in retry
ret = target(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/utils/api_utils.py", line 69, in openai_chat_completion_create_retrying
result = request_with_timeout(openai.ChatCompletion.create, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/utils/api_utils.py", line 46, in request_with_timeout
result = future.result(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 151, in create
) = cls.__prepare_create_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 108, in __prepare_create_request
requestor = api_requestor.APIRequestor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 139, in __init__
self.api_key = key or util.default_api_key()
^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/util.py", line 186, in default_api_key
raise openai.error.AuthenticationError(
openai.error.AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
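Once I generate a key, the error message says I can just export it before rerunning; a minimal sketch of what I plan to try (the key value is a placeholder, and the model argument is the same assumption as above):

```sh
# placeholder key; real keys come from https://platform.openai.com/account/api-keys
export OPENAI_API_KEY="sk-..."
oaieval gpt-3.5-turbo test-match
```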