Test inference endpoint model config parsing from path #434
Merged: albertvillanova merged 12 commits into huggingface:main from albertvillanova:test-inference-endpoint-model-config-from-path on Dec 12, 2024.
Commits (12):
47f6a6e  Add example model config for existing endpoint (albertvillanova)
10d93aa  Test InferenceEndpointModelConfig.from_path (albertvillanova)
4c8e01a  Comment default main branch in example (albertvillanova)
b0c82b8  Fix typo (albertvillanova)
e3ffecc  Delete unused add_special_tokens param in endpoint example config (albertvillanova)
30a7928  Fix typo (albertvillanova)
1f5b589  Implement InferenceEndpointModelConfig.from_path (albertvillanova)
6ac7667  Use InferenceEndpointModelConfig.from_path (albertvillanova)
e9ff0c6  Refactor InferenceEndpointModelConfig.from_path (albertvillanova)
8f8927c  Align docs (albertvillanova)
ade22cf  Merge branch 'main' into test-inference-endpoint-model-config-from-path (albertvillanova)
da26699  Merge branch 'main' into test-inference-endpoint-model-config-from-path (clefourrier)
@@ -0,0 +1,5 @@
+model:
+  base_params:
+    # Pass either model_name, or endpoint_name and true reuse_existing
+    endpoint_name: "llama-2-7B-lighteval" # needs to be lower case without special characters
+    reuse_existing: true # defaults to false; if true, ignore all params in instance, and don't delete the endpoint after evaluation
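This example config exercises the reuse-existing path: only endpoint_name and reuse_existing are set, so an already-deployed endpoint is reused and is not deleted after evaluation. As a minimal sketch, assuming this is the examples/model_configs/endpoint_model_reuse_existing.yaml file referenced by the new test and that the snippet runs from the repository root, loading it with the from_path classmethod added later in this PR would look like this:

from lighteval.models.endpoints.endpoint_model import InferenceEndpointModelConfig

# Assumption: this diff is examples/model_configs/endpoint_model_reuse_existing.yaml,
# the path used by the test added in this PR.
config = InferenceEndpointModelConfig.from_path(
    "examples/model_configs/endpoint_model_reuse_existing.yaml"
)
print(config.endpoint_name)   # "llama-2-7B-lighteval"
print(config.reuse_existing)  # True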
@@ -103,12 +103,21 @@ def __post_init__(self):
         # xor operator, one is None but not the other
         if (self.instance_size is None) ^ (self.instance_type is None):
             raise ValueError(
-                "When creating an inference endpoint, you need to specify explicitely both instance_type and instance_size, or none of them for autoscaling."
+                "When creating an inference endpoint, you need to specify explicitly both instance_type and instance_size, or none of them for autoscaling."
             )

         if not (self.endpoint_name is None) ^ int(self.model_name is None):
             raise ValueError("You need to set either endpoint_name or model_name (but not both).")

+    @classmethod
+    def from_path(cls, path: str) -> "InferenceEndpointModelConfig":
+        import yaml
+
+        with open(path, "r") as f:
+            config = yaml.safe_load(f)["model"]
+        config["base_params"]["model_dtype"] = config["base_params"].pop("dtype", None)
+        return cls(**config["base_params"], **config.get("instance", {}))
+
     def get_dtype_args(self) -> Dict[str, str]:
         if self.model_dtype is None:
             return {}

Review comment on lines +118 to +119: "Nice and much cleaner! We might have to add this to all model types"
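To illustrate what from_path does with a fuller config, here is a minimal sketch; the YAML string below is hypothetical (it only mirrors the base_params/instance layout the method reads, with values borrowed from the test expectations), and it shows the dtype key being renamed to model_dtype and the instance section being merged into the constructor call:

import pathlib
import tempfile

from lighteval.models.endpoints.endpoint_model import InferenceEndpointModelConfig

# Hypothetical config written to a temporary file; the keys mirror the
# "base_params"/"instance" layout that from_path reads.
yaml_text = """\
model:
  base_params:
    model_name: "meta-llama/Llama-2-7b-hf"
    dtype: "float16"  # from_path renames this key to model_dtype
  instance:
    accelerator: "gpu"
    region: "eu-west-1"
"""

tmp_path = pathlib.Path(tempfile.mkdtemp()) / "endpoint_model_sketch.yaml"
tmp_path.write_text(yaml_text)

config = InferenceEndpointModelConfig.from_path(str(tmp_path))
print(config.model_dtype)  # "float16"
print(config.accelerator)  # "gpu"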
@@ -0,0 +1,85 @@
+# MIT License
+
+# Copyright (c) 2024 The HuggingFace Team
+
+# Permission is hereby granted, free of charge, to any person obtaining a copy
+# of this software and associated documentation files (the "Software"), to deal
+# in the Software without restriction, including without limitation the rights
+# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+# copies of the Software, and to permit persons to whom the Software is
+# furnished to do so, subject to the following conditions:
+
+# The above copyright notice and this permission notice shall be included in all
+# copies or substantial portions of the Software.
+
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+# SOFTWARE.
+
+import pytest
+
+from lighteval.models.endpoints.endpoint_model import InferenceEndpointModelConfig
+
+
+# "examples/model_configs/endpoint_model.yaml"
+
+
+class TestInferenceEndpointModelConfig:
+    @pytest.mark.parametrize(
+        "config_path, expected_config",
+        [
+            (
+                "examples/model_configs/endpoint_model.yaml",
+                {
+                    "model_name": "meta-llama/Llama-2-7b-hf",
+                    "revision": "main",
+                    "model_dtype": "float16",
+                    "endpoint_name": None,
+                    "reuse_existing": False,
+                    "accelerator": "gpu",
+                    "region": "eu-west-1",
+                    "vendor": "aws",
+                    "instance_type": "nvidia-a10g",
+                    "instance_size": "x1",
+                    "framework": "pytorch",
+                    "endpoint_type": "protected",
+                    "namespace": None,
+                    "image_url": None,
+                    "env_vars": None,
+                },
+            ),
+            (
+                "examples/model_configs/endpoint_model_lite.yaml",
+                {
+                    "model_name": "meta-llama/Llama-3.1-8B-Instruct",
+                    # Defaults:
+                    "revision": "main",
+                    "model_dtype": None,
+                    "endpoint_name": None,
+                    "reuse_existing": False,
+                    "accelerator": "gpu",
+                    "region": "us-east-1",
+                    "vendor": "aws",
+                    "instance_type": None,
+                    "instance_size": None,
+                    "framework": "pytorch",
+                    "endpoint_type": "protected",
+                    "namespace": None,
+                    "image_url": None,
+                    "env_vars": None,
+                },
+            ),
+            (
+                "examples/model_configs/endpoint_model_reuse_existing.yaml",
+                {"endpoint_name": "llama-2-7B-lighteval", "reuse_existing": True},
+            ),
+        ],
+    )
+    def test_from_path(self, config_path, expected_config):
+        config = InferenceEndpointModelConfig.from_path(config_path)
+        for key, value in expected_config.items():
+            assert getattr(config, key) == value
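The second parametrized case implies that endpoint_model_lite.yaml sets little more than model_name and relies on the config class's defaults for everything else; the file itself is not shown in this PR, so the YAML below is a hedged reconstruction that mirrors what from_path does, minus the file I/O:

import yaml

from lighteval.models.endpoints.endpoint_model import InferenceEndpointModelConfig

# Hypothetical "lite" config: only model_name is set, so every field asserted
# under "# Defaults:" above should come from the config class's defaults.
raw = yaml.safe_load(
    """
model:
  base_params:
    model_name: "meta-llama/Llama-3.1-8B-Instruct"
"""
)["model"]

# Mirror the key handling in from_path.
raw["base_params"]["model_dtype"] = raw["base_params"].pop("dtype", None)
config = InferenceEndpointModelConfig(**raw["base_params"], **raw.get("instance", {}))
print(config.revision)       # "main" (default)
print(config.instance_type)  # None, i.e. autoscaling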
Review comment: "and does not"