docs/source/quicktour.mdx (1 addition, 1 deletion)

```diff
@@ -148,7 +148,7 @@ accelerate).
 ### VLLM
 
 - **pretrained** (str): HuggingFace Hub model ID name or the path to a pre-trained model to load.
-- **gpu_memory_utilisation** (float): The fraction of GPU memory to use.
+- **gpu_memory_utilization** (float): The fraction of GPU memory to use.
 - **batch_size** (int): The batch size for model training.
 - **revision** (str): The revision of the model.
 - **dtype** (str, None): The data type to use for the model.
```
docs/source/use-vllm-as-backend.mdx (1 addition, 1 deletion)

```diff
@@ -57,4 +57,4 @@ model: # Model specific parameters
 
 > [!WARNING]
 > In the case of OOM issues, you might need to reduce the context size of the
-> model as well as reduce the `gpu_memory_utilisation` parameter.
+> model as well as reduce the `gpu_memory_utilization` parameter.
```
src/lighteval/models/vllm/vllm_model.py (2 additions, 2 deletions)

```diff
@@ -76,7 +76,7 @@
 @dataclass
 class VLLMModelConfig:
     pretrained: str
-    gpu_memory_utilisation: float = 0.9  # lower this if you are running out of memory
+    gpu_memory_utilization: float = 0.9  # lower this if you are running out of memory
     revision: str = "main"  # revision of the model
     dtype: str | None = None
     tensor_parallel_size: int = 1  # how many GPUs to use for tensor parallelism
@@ -174,7 +174,7 @@ def _create_auto_model(self, config: VLLMModelConfig, env_config: EnvConfig) ->
         """
         self.model_args = {
             "model": config.pretrained,
-            "gpu_memory_utilization": float(config.gpu_memory_utilisation),
+            "gpu_memory_utilization": float(config.gpu_memory_utilization),
             "revision": config.revision + (f"/{config.subfolder}" if config.subfolder is not None else ""),
             "dtype": config.dtype,
             "trust_remote_code": config.trust_remote_code,
         }
```
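Note that in the second hunk the dict key was already spelled `gpu_memory_utilization`; only the attribute access on the config changes, so after the rename the key and the attribute finally match. A minimal, self-contained approximation of that effect (the `subfolder`, `dtype`, and `trust_remote_code` handling is elided for brevity):

```python
from dataclasses import dataclass

@dataclass
class VLLMModelConfig:
    pretrained: str
    gpu_memory_utilization: float = 0.9
    revision: str = "main"

def build_model_args(config: VLLMModelConfig) -> dict:
    # Mirrors the dict built in _create_auto_model after the rename:
    # the vLLM-facing key and the config attribute now share one spelling.
    return {
        "model": config.pretrained,
        "gpu_memory_utilization": float(config.gpu_memory_utilization),
        "revision": config.revision,
    }

args = build_model_args(VLLMModelConfig(pretrained="my-org/my-model"))
```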