Ollama working with llama3 #91

Open
ejgutierrez74 opened this issue May 17, 2024 · 2 comments

Comments

@ejgutierrez74

Hi, I recently took an introductory course about Llama 2/3: https://learn.deeplearning.ai/courses/prompt-engineering-with-llama-2/lesson/1/introduction

The instructor is one of the people leading Llama 3: https://www.linkedin.com/in/amitsangani?trk=public_post_feed-actor-name
In the course they teach that, to use Llama in a multi-turn conversation, you have to format the prompt so that every user question / system answer pair is marked, and wrap every user question in [INST] [/INST] tags. An image of the idea:

[image: screenshot from the course showing the multi-turn [INST] prompt format]

Or perhaps cleaner:
[image: a cleaner diagram of the same prompt format]

The question is: when you use the ollama JS library to talk to a llama2/llama3 model via localhost, is all this work done in the background (i.e. transforming prompts and responses into the format explained above)?

I was thinking of building a chat prompt from the past prompts and responses myself. As you can see below, I keep the prompts used so far in a prompts array and the past responses in a responses array (only the response message, not the whole response object).

But my question is: is this already done automatically by ollama JS? Is it pointless to do it myself, or would I gain speed or anything else by using this format?

Thanks in advance

export function get_prompt_chatv2(prompts, responses, verbose = false) {
    let prompt_chat = "";
    if (verbose) {
        console.log(`Number of prompts:\n${prompts.length}\n`);
        console.log(`Number of responses:\n${responses.length}\n`);
    }

    // There must be exactly one more prompt than responses:
    // the last prompt is the new question still waiting for an answer.
    if (prompts.length !== responses.length + 1) {
        console.log("\n Error: the number of prompts must be the number of responses + 1.");
        return prompt_chat;
    }

    // Each past exchange becomes <s>[INST] question [/INST] answer </s>
    let size = 0;
    for (let currentResponse of responses) {
        prompt_chat += `<s>[INST] \n${prompts[size]}\n [/INST] \n${currentResponse}\n </s>`;
        size = size + 1;
    }

    // Append the last prompt (left open for the model to answer)
    prompt_chat += `<s>[INST] \n${prompts[prompts.length - 1]}\n [/INST] \n`;

    return prompt_chat;
}
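
For illustration, here is a hypothetical call with two past exchanges (the prompts and responses below are made up):

const prompts = [
    "What is the capital of France?",
    "And the capital of Spain?",
    "Which of those two cities is bigger?"
];
const responses = [
    "The capital of France is Paris.",
    "The capital of Spain is Madrid."
];

// prompts.length (3) === responses.length (2) + 1, so the check passes
const prompt_chat = get_prompt_chatv2(prompts, responses, true);
// prompt_chat contains the two closed <s>[INST] ... [/INST] ... </s> pairs
// followed by the last question left open: <s>[INST] ... [/INST]
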
@ejgutierrez74
Author

Any response or ideas?

@hopperelec
Contributor

This isn't something handled by ollama-js, but by ollama itself using "templates". Whenever you create a model using ollama, you provide a Modelfile (and models you pull come with one), which basically just describes all the defaults for that model. Part of that Modelfile is the template, which is exactly what you are describing. So, chances are that you are already using the correct template. However, as I mentioned, this is just a default, so you can override it if you wish. In ollama-js, you do this by setting the value of template when calling ollama.generate({...}).
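
For example, a minimal sketch of overriding the template for a single request with ollama-js (the template string here is only an illustration, not the official Llama 3 template that ships with the model):

import ollama from 'ollama';

// Override the model's default template for this one request.
// NOTE: the template below is illustrative; normally you would just rely on
// the template already baked into the model's Modelfile.
const response = await ollama.generate({
    model: 'llama3',
    prompt: 'Why is the sky blue?',
    template: '<s>[INST] {{ .Prompt }} [/INST]',
});

console.log(response.response);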
