I'm new to evals so I may be missing something, but I was surprised to find that evals/elsuite/modelgraded/classify.py seems to set up the grading model to grade a single question at a time, and that there isn't an equivalent approach for getting multiple questions graded in one call. I see that we can create our own subclasses and implement custom behavior, but before doing so, I wanted to make sure I wasn't missing something obvious.
E.g. in a QA task, the input conversation that I want evaluated might have 10 question-answer pairs, and I want the grading model to return something like an array of grades for that sample.
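To make that concrete, here's a rough sketch of what I mean, written against the plain OpenAI client rather than the evals classes. The model name, prompt wording, and the expectation that the grader returns a JSON array are all placeholder assumptions on my part, not anything the repo provides:

```python
import json
from openai import OpenAI

client = OpenAI()

def grade_sample(shared_context: str, qa_pairs: list[dict]) -> list[str]:
    """Send every QA pair from one sample to the grading model in a single call."""
    numbered = "\n".join(
        f"{i + 1}. Q: {qa['question']}\n   A: {qa['answer']}"
        for i, qa in enumerate(qa_pairs)
    )
    messages = [
        {"role": "system", "content": shared_context},
        {
            "role": "user",
            "content": (
                "Grade each answer below as Correct or Incorrect. "
                'Reply with a JSON array of strings, e.g. ["Correct", "Incorrect"].\n\n'
                + numbered
            ),
        },
    ]
    # "gpt-4" is just a placeholder grading model.
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    # Assumes the grader actually returns valid JSON; real code would need to
    # handle parse failures and length mismatches.
    return json.loads(response.choices[0].message.content)
```

The point is just that one request grades all ten pairs, instead of ten requests each carrying the full shared context.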
Tangentially related, it would be nice to more easily run past answers through an eval: instead of querying the model, I could reuse responses that I've saved from a previous invocation.
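Something like this is what I had in mind for replaying saved answers; again just a sketch, with a made-up cache file format and key scheme rather than anything evals actually supports:

```python
import hashlib
import json
from pathlib import Path

# Hypothetical file of completions saved from a previous run, one JSON object
# per line with "prompt_hash" and "completion" fields.
CACHE_PATH = Path("cached_completions.jsonl")

def load_cache(path: Path = CACHE_PATH) -> dict[str, str]:
    """Load saved completions keyed by a hash of the prompt that produced them."""
    cache: dict[str, str] = {}
    if path.exists():
        for line in path.read_text().splitlines():
            record = json.loads(line)
            cache[record["prompt_hash"]] = record["completion"]
    return cache

def cached_completion(prompt: str, cache: dict[str, str]) -> str | None:
    """Return the saved completion for this prompt, or None if it was never seen."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    return cache.get(key)
```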
For both of these, the motivation is reducing cost. My prompt has a lot of context, so it seems wasteful to pass the same context to the grading model over and over again. On the other hand, the developer time needed to implement these custom features probably outweighs the savings, but these seem like common use cases and I want to make sure I'm not overcomplicating this.
Thanks!