Hi, recently I took an introductory course about Llama 2/3: https://learn.deeplearning.ai/courses/prompt-engineering-with-llama-2/lesson/1/introduction
The teacher is one of the leads of Llama 3: https://www.linkedin.com/in/amitsangani?trk=public_post_feed-actor-name
In the course they teach that to use Llama in a multi-turn conversation you have to format the prompt in a specific way: mark every user-question/model-answer pair with special tokens, and wrap every user prompt (the question) in [INST] [/INST].
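For reference, the multi-turn format used by the Llama 2 chat models looks roughly like this (the <<SYS>> system block is optional; the placeholder names are illustrative):

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_msg_1 }} [/INST] {{ model_answer_1 }} </s><s>[INST] {{ user_msg_2 }} [/INST]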
My question is: when you use the ollama JS library to talk to a llama2/llama3 model via localhost, is all of this work done in the background, i.e. are prompts and responses transformed into the format explained above?
I was thinking of building the chat prompt from the prompts and responses myself. As you can see below, I keep the past prompts in a prompts array and the past responses in a responses array (only the response message, not the whole response object).
But is this done automatically by ollama JS? Is it pointless to do it myself, or would I gain speed or anything else by using this format?
Thanks in advance
// Builds a Llama 2 style chat prompt from the history of past prompts and
// responses, wrapping each turn in <s>[INST] ... [/INST] ... </s>.
export function get_prompt_chatv2(prompts, responses, verbose = false) {
  let prompt_chat = "";
  if (verbose) {
    console.log(`Number of prompts:\n${prompts.length}\n`);
    console.log(`Number of responses:\n${responses.length}\n`);
  }
  // The last prompt is still unanswered, so there must be exactly one more
  // prompt than there are responses.
  if (prompts.length !== responses.length + 1) {
    console.log("\nError: the number of prompts must be the number of responses + 1.");
    return prompt_chat;
  }
  // Each answered pair becomes one <s>[INST] question [/INST] answer </s> block.
  responses.forEach((response, i) => {
    prompt_chat += `<s>[INST] \n${prompts[i]}\n [/INST] \n${response}\n </s>`;
  });
  // Append the last, still unanswered prompt.
  prompt_chat += `<s>[INST] \n${prompts[prompts.length - 1]}\n [/INST] \n`;
  return prompt_chat;
}
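For example, with a hypothetical two-turn history (the strings are made up for illustration), the function produces one closed block per answered turn plus an open block for the pending question:

const prompts = ["What is the capital of France?", "And of Spain?"];
const responses = ["The capital of France is Paris."];

console.log(get_prompt_chatv2(prompts, responses));
// (newlines elided)
// <s>[INST] What is the capital of France? [/INST] The capital of France is Paris. </s><s>[INST] And of Spain? [/INST]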
This isn't something handled by ollama-js, but instead by ollama itself using "templates". Whenever you create (or pull) a model using ollama, you provide a Modelfile which basically just describes all the defaults for that model. Part of that Modelfile is the template, which is exactly what you are describing. So, chances are that you are already using the correct template. However, as I mentioned, this is just a default, so you can override it if you wish. In ollama-js, you do this by setting the value of template when using ollama.generate({...}).
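As a rough sketch (the exact template ships with each model, so treat this as an approximation), the TEMPLATE entry in a llama2 Modelfile uses Go template syntax along these lines:

TEMPLATE """[INST] <<SYS>>{{ .System }}<</SYS>>

{{ .Prompt }} [/INST]"""

And to override it per request in ollama-js, for example with a pass-through template so a hand-built string like the prompt_chat above is sent to the model verbatim:

import ollama from 'ollama'

// Minimal sketch: override the model's default template with a pass-through
// one, so the prompt string we formatted ourselves is used as-is.
const response = await ollama.generate({
  model: 'llama2',
  prompt: prompt_chat,        // the string built by get_prompt_chatv2 above
  template: '{{ .Prompt }}',  // replaced verbatim by the prompt
})
console.log(response.response)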