Updated tgi_model and added parameters for endpoint_model (#208)
* Added image url parameter
* Fixed up tgi model config
* Undid tgi available check
* Adjusted tgi parameter names, and checked for attr existence
* Fixed task Id in argparse
* Removed obfuscation from private functions, to allow inheritance to override
* Updated tgi model to inherit from endpoint and just modify client calls
* Added option to specify model id in config for tgi model
* Added option to specify custom env vars
* Updated env vars
* Applied ruff format
* Added docs + readme
* Ruff format
An alternative to launching the evaluation locally is to serve the model on a TGI-compatible server/container and then run the evaluation by sending requests to the server. The command is the same as before, except you specify a path to a yaml config file (detailed below):
```shell
python run_evals_accelerate.py \
    --model_config_path="/path/to/config/file" \
    --tasks <task parameters> \
    --output_dir output_dir
```
There are two types of configuration files that can be provided for running on the server:
1. [endpoint_model.yaml](./examples/model_configs/endpoint_model.yaml): This configuration allows you to launch the model using [HuggingFace's Inference Endpoints](https://huggingface.co/inference-endpoints/dedicated). You can specify all the relevant parameters in the configuration file, and `lighteval` will then automatically deploy the endpoint, run the evaluation, and finally delete the endpoint (unless you specify an endpoint that was already launched, in which case the endpoint won't be deleted afterwards).
2. [tgi_model.yaml](./examples/model_configs/tgi_model.yaml): This configuration lets you specify the URL of a model already running in a TGI container, such as one deployed on HuggingFace's serverless inference (a minimal sketch is given below).
201
+
202
+
Templates for these configurations can be found in [examples/model_configs](./examples/model_configs/).
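For reference, the `tgi_model` config is much smaller than the endpoint one. A minimal sketch with placeholder values, following the field names in the linked template (check the template for the authoritative schema):

```yaml
model:
  instance:
    inference_server_address: "http://localhost:8080" # placeholder: address of the running TGI server
    inference_server_auth: null # optional auth token for protected servers
    model_id: null # optional: name the served model explicitly (the "model id" option added in this PR)
```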
### Evaluate a model on extended, community, or custom tasks.
Independently of the default tasks provided in `lighteval` that you will find in the `tasks_table.jsonl` file, you can use `lighteval` to evaluate models on tasks that require special processing (or have been added by the community). These tasks have their own evaluation suites and are defined as follows:
* `community`: tasks that have been added by the community. See the [`community_tasks`](./community_tasks) folder for examples.
* `custom`: tasks that are defined locally and not present in the core library. Use this suite if you want to experiment with designing a special metric or task.
For example, to run an extended task like `ifeval`, you can run:
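A plausible invocation, following the `suite|task|num_fewshot|truncate_fewshots` task format (the model argument here is illustrative, not prescribed by this diff):

```shell
python run_evals_accelerate.py \
    --model_args "pretrained=HuggingFaceH4/zephyr-7b-beta" \
    --tasks "extended|ifeval|0|0" \
    --output_dir output_dir
```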
`examples/model_configs/endpoint_model.yaml` (4 additions, 1 deletion):
```diff
@@ -5,7 +5,7 @@ model:
   model: "meta-llama/Llama-2-7b-hf"
   revision: "main"
   dtype: "float16" # can be any of "awq", "eetq", "gptq", "4bit" or "8bit" (will use bitsandbytes), "bfloat16" or "float16"
-  reuse_existing: false # if true, ignore all params in instance
+  reuse_existing: false # if true, ignore all params in instance, and don't delete the endpoint after evaluation
   instance:
     accelerator: "gpu"
     region: "eu-west-1"
@@ -15,5 +15,8 @@ model:
     framework: "pytorch"
     endpoint_type: "protected"
     namespace: null # The namespace under which to launch the endpoint. Defaults to the current user's namespace
+    image_url: null # Optionally specify the Docker image to use when launching the endpoint model, e.g. a later release of the TGI container with support for newer models
+    env_vars:
+      null # Optional environment variables to include when launching the endpoint, e.g. `MAX_INPUT_LENGTH: 2048`
```
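Filled in, the two new fields might look like the following sketch (the image tag is an example only; TGI images are published under `ghcr.io/huggingface/text-generation-inference`, and the variable comes from the comment above):

```yaml
    image_url: "ghcr.io/huggingface/text-generation-inference:1.4.0" # example tag, not a recommendation
    env_vars:
      MAX_INPUT_LENGTH: 2048
```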