
[BUG] Metrics for Evals should be Optional; Currently Tool Quality is hard set #242

Open
agutta opened this issue Oct 1, 2024 · 0 comments
Labels: bug (Something isn't working)

Comments

agutta commented Oct 1, 2024

Running the Vertex eval set forces us to add a tool call.

Expected Behavior

If the playbook we are testing doesn't have a tool call, it should not be mandatory to add one in the input sheet.
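
For example (a hypothetical call, reusing the `Evaluations` constructor shown further down), it should be possible to run an eval with only the similarity metric:

```python
# Hypothetical desired usage: no tool-call metric requested,
# so no tool call should be required in the input sheet.
evals = Evaluations(agent_id, metrics=["response_similarity"])
```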

Current Behavior

A tool call must be added in the input sheet; otherwise this line throws an error:
data.append_test_results_to_sheets(eval_results, sheet_name, summary_tab="reporting")

ValueError: Out of range float values are not JSON compliant

During handling of the above exception, another exception occurred:

InvalidJSONError                          Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/requests/models.py in prepare_body(self, data, files, json)
    510                 body = complexjson.dumps(json, allow_nan=False)
    511             except ValueError as ve:
--> 512                 raise InvalidJSONError(ve, request=self)
    513 
    514             if not isinstance(body, bytes):

InvalidJSONError: Out of range float values are not JSON compliant
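
What seems to be happening (my reading of the traceback, not confirmed against the library code): with no tool call in the sheet, the tool-call metric columns come back as NaN, and `requests` serializes the Sheets payload with `allow_nan=False`, so NaN blows up. A minimal sketch of the failure and a plausible sanitizing workaround, assuming `eval_results` is a pandas DataFrame:

```python
import json
import pandas as pd

# NaN is rejected when allow_nan=False, which is how requests
# serializes a json= payload (see prepare_body above).
row = {"response_similarity": 0.91, "tool_name_match": float("nan")}
try:
    json.dumps(row, allow_nan=False)
except ValueError as err:
    print(err)  # Out of range float values are not JSON compliant

# Plausible workaround: blank out NaN cells before pushing to Sheets.
eval_results = pd.DataFrame([row])
sanitized = eval_results.fillna("")
print(json.dumps(sanitized.to_dict(orient="records"), allow_nan=False))
```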

===============================================================

Trying to remove tool_call_quality does not help; we get a different error:
evals = Evaluations(agent_id, metrics=["response_similarity", "tool_call_quality"])

KeyError: 'tool_name_match'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in get_loc(self, key)
   3810             ):
   3811                 raise InvalidIndexError(key)
-> 3812             raise KeyError(key) from err
   3813         except TypeError:
   3814             # If we have a listlike key, _check_indexing_error will raise

KeyError: 'tool_name_match'
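
The KeyError suggests the reporting code indexes the tool-call columns unconditionally, even when that metric was never computed. A sketch of the kind of guard that would avoid this (illustrative only; the column list and `summarize` function are my assumptions, not the library's actual code):

```python
import pandas as pd

# Assumed column names; only "tool_name_match" appears in the traceback.
TOOL_CALL_COLUMNS = ["tool_name_match"]

def summarize(results: pd.DataFrame) -> dict:
    # Only aggregate tool-call columns that actually exist, so playbooks
    # without tool calls don't raise KeyError.
    present = [col for col in TOOL_CALL_COLUMNS if col in results.columns]
    return {col: results[col].mean() for col in present}

results = pd.DataFrame({"response_similarity": [0.9, 0.8]})
print(summarize(results))  # {} -> no tool-call columns, no KeyError
```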
agutta added the bug label on Oct 1, 2024
kmaphoenix changed the title from [BUG] <Issue Summary Here> to [BUG] Metrics for Evals should be Optional; Currently Tool Quality is hard set on Oct 1, 2024