Add new model NECTEC (nectec/OpenThaiLLM-Prebuilt-7B, nectec/Pathumma-llm-text-1.0.0) #3188

JackJessada · 2024-11-28T12:56:45Z

We are pleased to announce the release of two new Thai language models by the National Electronics and Computer Technology Center (NECTEC): OpenThaiLLM-Prebuilt-7B and Pathumma-llm-text-1.0.0. Both models have been evaluated on a local leaderboard for the ThaiExam benchmark. Their scores are:

OpenThaiLLM-Prebuilt-7B: 0.594
Pathumma-llm-text-1.0.0: 0.566

We would like to request the integration of these models into the HELM leaderboard under the ThaiExam benchmark. Below, we have provided the necessary configuration files for deployment and metadata:

model_deployments.yaml

# NECTEC
  - name: huggingface/Pathumma-llm-text-1.0.0
    model_name: nectec/Pathumma-llm-text-1.0.0
    tokenizer_name: nectec/Pathumma-llm-text-1.0.0
    max_sequence_length: 8192
    client_spec:
      class_name: "helm.clients.huggingface_client.HuggingFaceClient"

  - name: huggingface/OpenThaiLLM-Prebuilt-7B
    model_name: nectec/OpenThaiLLM-Prebuilt-7B
    tokenizer_name: nectec/OpenThaiLLM-Prebuilt-7B
    max_sequence_length: 4096
    client_spec:
      class_name: "helm.clients.huggingface_client.HuggingFaceClient"

model_metadata.yaml

  # NECTEC
  - name: nectec/Pathumma-llm-text-1.0.0
    display_name: Pathumma-llm-text-1.0.0 (7B)
    description: Pathumma-llm-text-1.0.0 (7B) is an instruction model based on OpenThaiLLM-Prebuilt-7B ([blog](https://medium.com/nectec/pathummallm-v-1-0-0-release-6a098ddfe276))
    creator_organization_name: nectec
    access: open
    num_parameters: 7000000000
    release_date: 2024-10-28
    tags: [TEXT_MODEL_TAG, PARTIAL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
 
  - name: nectec/OpenThaiLLM-Prebuilt-7B
    display_name: OpenThaiLLM-Prebuilt-7B (7B)
    description: OpenThaiLLM-Prebuilt-7B (7B) is a pretrained Thai large language model with 7 billion parameters based on Qwen2.5-7B.
    creator_organization_name: nectec
    access: open
    num_parameters: 7000000000
    release_date: 2024-10-28
    tags: [TEXT_MODEL_TAG, PARTIAL_FUNCTIONALITY_TEXT_MODEL_TAG]
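
As a quick sanity check on metadata entries like these, the sketch below compares the size label in `display_name` (for example `(7B)`) with `num_parameters`. It is not part of HELM; the helper names `size_label_to_params` and `params_match_label` and the 25% tolerance are assumptions made for illustration.

```python
import re
from typing import Optional


def size_label_to_params(display_name: str) -> Optional[int]:
    """Parse a trailing size label such as '(7B)' or '(350M)' into a parameter count."""
    match = re.search(r"\((\d+(?:\.\d+)?)([BM])\)\s*$", display_name)
    if match is None:
        return None  # display name carries no size label
    value = float(match.group(1))
    scale = 1_000_000_000 if match.group(2) == "B" else 1_000_000
    return int(value * scale)


def params_match_label(display_name: str, num_parameters: int,
                       tolerance: float = 0.25) -> bool:
    """Return True if num_parameters is within a relative tolerance of the label.

    The tolerance is a hypothetical choice: size labels are usually rounded,
    so an exact match is not expected.
    """
    expected = size_label_to_params(display_name)
    if expected is None:
        return True  # nothing to compare against
    return abs(num_parameters - expected) <= tolerance * expected


if __name__ == "__main__":
    name = "OpenThaiLLM-Prebuilt-7B (7B)"
    for count in (7_000_000_000, 72_000_000_000):
        print(count, params_match_label(name, count))
```

A check of this kind would flag a parameter count that is an order of magnitude away from the label while still accepting rounded labels.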

tokenizer_configs.yaml

  # OpenThaiLLM-Prebuilt
  - name: nectec/OpenThaiLLM-Prebuilt-7B
    tokenizer_spec:
      class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
    end_of_text_token: "<|im_end|>"
    prefix_token: ""
  # Nectec
  - name: nectec/Pathumma-llm-text-1.0.0
    tokenizer_spec:
      class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
    end_of_text_token: "<|im_end|>"
    prefix_token: "<|im_start|>"
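
The three files above appear to be linked by name: each deployment's `model_name` should match a `name` in model_metadata.yaml, and its `tokenizer_name` should match a `name` in tokenizer_configs.yaml. The sketch below checks that linkage on plain dictionaries copied from the entries above. The function `find_missing_references` is hypothetical and not part of HELM; in practice the lists would come from parsing the YAML files.

```python
from typing import Dict, List


def find_missing_references(
    deployments: List[Dict[str, str]],
    metadata: List[Dict[str, str]],
    tokenizers: List[Dict[str, str]],
) -> List[str]:
    """Report deployments whose model or tokenizer name has no matching entry."""
    model_names = {entry["name"] for entry in metadata}
    tokenizer_names = {entry["name"] for entry in tokenizers}
    problems = []
    for dep in deployments:
        if dep["model_name"] not in model_names:
            problems.append(f"{dep['name']}: no metadata entry for {dep['model_name']}")
        if dep["tokenizer_name"] not in tokenizer_names:
            problems.append(f"{dep['name']}: no tokenizer config for {dep['tokenizer_name']}")
    return problems


# Names copied from the configuration entries in this issue.
deployments = [
    {
        "name": "huggingface/Pathumma-llm-text-1.0.0",
        "model_name": "nectec/Pathumma-llm-text-1.0.0",
        "tokenizer_name": "nectec/Pathumma-llm-text-1.0.0",
    },
    {
        "name": "huggingface/OpenThaiLLM-Prebuilt-7B",
        "model_name": "nectec/OpenThaiLLM-Prebuilt-7B",
        "tokenizer_name": "nectec/OpenThaiLLM-Prebuilt-7B",
    },
]
metadata = [
    {"name": "nectec/Pathumma-llm-text-1.0.0"},
    {"name": "nectec/OpenThaiLLM-Prebuilt-7B"},
]
tokenizers = [
    {"name": "nectec/OpenThaiLLM-Prebuilt-7B"},
    {"name": "nectec/Pathumma-llm-text-1.0.0"},
]

problems = find_missing_references(deployments, metadata, tokenizers)
print(problems if problems else "all references resolve")
```

Running a check like this before opening a pull request can catch a misspelled model or tokenizer name early.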

yifanmai · 2024-12-02T22:44:10Z

This looks great, thank you! Could you open a pull request with these changes?

JackJessada · 2024-12-04T00:14:21Z

Thanks for the response! I have already opened a pull request; you can check it out here: #3197 (comment)

yifanmai · 2024-12-07T00:42:30Z

Great, thanks! The pull request has been merged. I will update the leaderboard with results for these models in the next few weeks.
