Closed
Describe the bug
I'd like to deploy the Mistral 0.2 LLM on SageMaker, and it seems this requires version 1.3.3 of the Hugging Face LLM container. For now, the huggingface-llm image lookup is limited to a set of versions that does not include 1.3.3.
To reproduce
Run the following code:
```python
#!/usr/bin/env python3
import json
import re

import boto3
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client("iam")
    role = iam.get_role(RoleName="exec-role")["Role"]["Arn"]

# Hub Model configuration. https://huggingface.co/models
hub = {
    "HF_MODEL_ID": "mistralai/Mistral-7B-Instruct-v0.2",
    "SM_NUM_GPUS": json.dumps(1),
    # "HF_MODEL_QUANTIZE": "gptq",
    # "HF_TASK": "question-answering",
    # Enable a long input length, overriding default SageMaker values.
    # See https://github.com/facebookresearch/llama/issues/450#issuecomment-1645247796
    "MAX_INPUT_LENGTH": json.dumps(4095),
    "MAX_TOTAL_TOKENS": json.dumps(4096),
}

# Ensure the endpoint name will be compliant with AWS naming rules.
regex = r"[^\-a-zA-Z0-9]+"
compliant_name = re.sub(regex, "-", hub["HF_MODEL_ID"])

# Create the Hugging Face Model class.
huggingface_model = HuggingFaceModel(
    # Here we'd like to have at least 1.3.3.
    # See https://github.com/huggingface/text-generation-inference/issues/1342
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.3.3"),
    env=hub,
    role=role,
    name=compliant_name,
)

# Deploy the model to SageMaker Inference.
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
    container_startup_health_check_timeout=300,
    endpoint_name=compliant_name,
)
```
Expected behavior
Being able to deploy with Hugging Face LLM container version 1.3.3.
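As a possible interim workaround (a sketch only, not verified): `image_uri` accepts any explicit ECR URI, so the SDK's version table could be bypassed by constructing the TGI image URI by hand. Both the account ID and the image tag below are assumptions and should be checked against the AWS Deep Learning Containers release notes for the target region.

```python
# Sketch of a possible workaround: build the TGI container URI manually instead
# of relying on get_huggingface_llm_image_uri's built-in version table.
# The account ID (763104351884) and the image tag are assumptions here;
# verify them against the AWS Deep Learning Containers release notes.
def tgi_image_uri(region: str, tag: str) -> str:
    return (
        f"763104351884.dkr.ecr.{region}.amazonaws.com/"
        f"huggingface-pytorch-tgi-inference:{tag}"
    )

# Hypothetical tag for a TGI 1.3.3 image; confirm before use.
uri = tgi_image_uri("us-east-1", "2.1.1-tgi1.3.3-gpu-py310-cu121-ubuntu20.04")
print(uri)
```

The resulting string could then be passed directly as `image_uri` to `HuggingFaceModel`, skipping the version lookup entirely.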
Screenshots or logs
System information
A description of your system. Please provide:
- SageMaker Python SDK version: 2.200.1
- Framework name (eg. PyTorch) or algorithm (eg. KMeans): huggingface-llm (PyTorch TGI inference)
- Framework version:
- Python version: 3.10
- CPU or GPU: GPU
- Custom Docker image (Y/N): N
Additional context
If it's a quick fix, I could probably help with a PR if needed.
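As an aside, the endpoint-name sanitization step from the repro script can be checked in isolation. SageMaker endpoint names only allow alphanumerics and hyphens, so the slash and dot in the model ID must be replaced:

```python
import re

# Same regex as in the repro script: collapse every run of characters that is
# not a hyphen, letter, or digit into a single hyphen.
regex = r"[^\-a-zA-Z0-9]+"
compliant_name = re.sub(regex, "-", "mistralai/Mistral-7B-Instruct-v0.2")
print(compliant_name)  # -> mistralai-Mistral-7B-Instruct-v0-2
```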
