Add Code-Centric Interface to LightEval for Enhanced Usability #148

@adithya-s-k

LightEval currently relies heavily on command-line (CLI) interaction. A more code-centric interface would make it easier to use within coding workflows and would greatly benefit users.

Consider the following refinement:

# Install the LightEval package (shell command, run outside Python)
pip install lighteval

from lighteval import Evaluator, EvaluatorArguments

def configure_dataset():
    # Define dataset formatting and evaluation parameters here
    ...

# Initialize evaluator for custom dataset evaluations
# (model and dataset are assumed to be loaded beforehand)
evaluator = Evaluator(
    model=model,
    eval_dataset=dataset,
    metric="loglikelihood_acc",
    dataset_text_field=configure_dataset,
    args=EvaluatorArguments(
        # Specify additional arguments for evaluation configuration,
        # e.g., batch size, evaluation steps, etc.
        batch_size=32,
        num_workers=4,
        # ...
    ),
)

# Initiate the evaluation process
evaluator.evaluate()

# Display results and publish statistics to the Hugging Face Hub
evaluator.show_results()
evaluator.push_results()

This approach offers a more structured and Pythonic way to use LightEval: a function defines the dataset formatting and evaluation specifics, and the EvaluatorArguments class encapsulates additional configuration such as batch size and number of workers. Evaluator and its methods follow conventional Python patterns, which should improve usability and integration within code-centric workflows.
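
To make the proposal a bit more concrete, here is a minimal, self-contained sketch of how such an interface could behave. It does not use LightEval at all: the `Evaluator` and `EvaluatorArguments` classes below are hypothetical stand-ins written from scratch, the `format_example` parameter and the `"answer"` field are assumptions made for illustration, and the metric is a plain exact-match accuracy rather than `loglikelihood_acc`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class EvaluatorArguments:
    """Hypothetical settings container (not an existing LightEval class)."""
    batch_size: int = 32
    num_workers: int = 4  # shown for illustration only; unused in this sketch


class Evaluator:
    """Hypothetical evaluator: batches examples, calls the model, scores exact matches."""

    def __init__(
        self,
        model: Callable[[List[str]], List[str]],
        eval_dataset: List[Dict[str, str]],
        format_example: Callable[[Dict[str, str]], str],
        args: Optional[EvaluatorArguments] = None,
    ):
        self.model = model
        self.eval_dataset = list(eval_dataset)
        self.format_example = format_example
        self.args = args or EvaluatorArguments()
        self.results: Optional[Dict[str, float]] = None

    def evaluate(self) -> Dict[str, float]:
        correct = 0
        step = self.args.batch_size
        for start in range(0, len(self.eval_dataset), step):
            batch = self.eval_dataset[start:start + step]
            prompts = [self.format_example(example) for example in batch]
            predictions = self.model(prompts)
            # Count predictions that exactly match the reference answer
            correct += sum(
                pred == example["answer"]
                for pred, example in zip(predictions, batch)
            )
        total = len(self.eval_dataset)
        self.results = {
            "accuracy": correct / total if total else 0.0,
            "num_examples": total,
        }
        return self.results


def toy_model(prompts: List[str]) -> List[str]:
    # Stand-in "model" that simply echoes the last word of each prompt
    return [prompt.split()[-1] for prompt in prompts]


dataset = [
    {"question": "Say yes", "answer": "yes"},
    {"question": "Say no", "answer": "maybe"},
]

evaluator = Evaluator(
    model=toy_model,
    eval_dataset=dataset,
    format_example=lambda example: example["question"],
    args=EvaluatorArguments(batch_size=1),
)
results = evaluator.evaluate()
print(results)
```

A real implementation would of course delegate to LightEval's existing task, metric, and model-loading machinery instead of the toy pieces used here. The point is only to show the shape of the calling code.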

If you believe this feature would be beneficial, I am eager to contribute to its development.

@clefourrier @NathanHB
