I am trying to create my own Eval to evaluate my model, but it seems like I can't even run test-match.
I am a Python newbie, so I could be missing something, but it looks like Evals might be in a bad state? (I did notice a lot of recent changes related to dependencies and to the area where things are failing, but I need a second pair of eyes to confirm there is an issue, please. =)
Note: Python 3.11.6 on macOS 14 (Sonoma), but I don't think this is the source of the problem.
Note 2: I don't expect this to work since I have yet to set up my API key, but this is just the reduced test case to replicate the issue.
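Steps to reproduce (a minimal sketch: the editable install matches the pip output below, while the gpt-3.5-turbo completion-fn argument is a placeholder on my part):

```sh
# from the root of the evals checkout; matches the "Building editable" pip output below
pip install -e .

# run the basic match eval; gpt-3.5-turbo is a placeholder completion fn
oaieval gpt-3.5-turbo test-match
```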
Here is a dump of the output and stack trace:
Building wheels for collected packages: evals
Building editable for evals (pyproject.toml) ... done
Created wheel for evals: filename=evals-1.0.3.post1-0.editable-py3-none-any.whl size=4945 sha256=3393f46e0147a629a88a9d4de137e6c0d3038b1591fa3b0beccead0f2bc77889
Stored in directory: /private/var/folders/n5/q9nl63rj51j85dz54l3k2j6m0000gn/T/pip-ephem-wheel-cache-r614x0i6/wheels/9b/27/20/476deee3ab3207f3e0943d8fc1b8d844ba71bf765d5c7afcbe
Successfully built evals
Installing collected packages: evals
Attempting uninstall: evals
Found existing installation: evals 1.0.3.post1
Uninstalling evals-1.0.3.post1:
Successfully uninstalled evals-1.0.3.post1
Successfully installed evals-1.0.3.post1
[2023-11-15 09:13:50,996] [registry.py:254] Loading registry from /private/tmp/evals/evals/registry/evals
[2023-11-15 09:13:51,296] [registry.py:254] Loading registry from /Users/john/.evals/evals
[2023-11-15 09:13:51,297] [oaieval.py:189] Run started: 231115171351U6WEASBU
[2023-11-15 09:13:51,299] [eval.py:36] Evaluating 3 samples
[2023-11-15 09:13:51,306] [eval.py:144] Running in threaded mode with 10 threads!
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/homebrew/bin/oaieval", line 8, in <module>
sys.exit(main())
^^^^^^
File "/private/tmp/evals/evals/cli/oaieval.py", line 274, in main
run(args)
File "/private/tmp/evals/evals/cli/oaieval.py", line 223, in run
result = eval.run(recorder)
^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/elsuite/basic/match.py", line 60, in run
self.eval_all_samples(recorder, samples)
File "/private/tmp/evals/evals/eval.py", line 146, in eval_all_samples
idx_and_result = list(tqdm(iter, total=len(work_items), disable=not show_progress))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 873, in next
raise value
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/eval.py", line 137, in eval_sample
return idx, self.eval_sample(sample, rng)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/elsuite/basic/match.py", line 46, in eval_sample
result = self.completion_fn(
^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/completion_fns/openai.py", line 129, in __call__
result = openai_chat_completion_create_retrying(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/backoff/_sync.py", line 105, in retry
ret = target(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/utils/api_utils.py", line 69, in openai_chat_completion_create_retrying
result = request_with_timeout(openai.ChatCompletion.create, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/tmp/evals/evals/utils/api_utils.py", line 46, in request_with_timeout
result = future.result(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/opt/homebrew/Cellar/[email protected]/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/chat_completion.py", line 25, in create
return super().create(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 151, in create
) = cls.__prepare_create_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 108, in __prepare_create_request
requestor = api_requestor.APIRequestor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/api_requestor.py", line 139, in __init__
self.api_key = key or util.default_api_key()
^^^^^^^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.11/site-packages/openai/util.py", line 186, in default_api_key
raise openai.error.AuthenticationError(
openai.error.AuthenticationError: No API key provided. You can set your API key in code using 'openai.api_key = <API-KEY>', or you can set the environment variable OPENAI_API_KEY=<API-KEY>). If your API key is stored in a file, you can point the openai module at it with 'openai.api_key_path = <PATH>'. You can generate API keys in the OpenAI web interface. See https://platform.openai.com/account/api-keys for details.
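Once I generate a key, the error message says I can just export it before rerunning; a minimal sketch of what I plan to try (the key value is a placeholder, and the model argument is the same assumption as above):

```sh
# placeholder key; real keys come from https://platform.openai.com/account/api-keys
export OPENAI_API_KEY="sk-..."
oaieval gpt-3.5-turbo test-match
```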