This repository was archived by the owner on May 10, 2023. It is now read-only.

Conversation

kuvaus commented Apr 30, 2023

Title: Add compatibility with new sampling algorithms in llama.cpp

Description: This pull request addresses issue #200 by adding compatibility with the new sampling algorithms in llama.cpp.

Changes:

Implemented temperature sampling with repetition penalty as an alternative to the previous llama_sample_top_p_top_k sampling method.

        // Temperature sampling with repetition_penalty
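        // Penalize tokens that appeared in the last repeat_last_n tokens of the context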
        llama_sample_repetition_penalty(
            d_ptr->ctx, &candidates_data,
            promptCtx.tokens.data() + promptCtx.n_ctx - promptCtx.repeat_last_n, promptCtx.repeat_last_n,
            promptCtx.repeat_penalty);
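        // Narrow the distribution with top-k and top-p, scale by temperature, then sample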
        llama_sample_top_k(d_ptr->ctx, &candidates_data, promptCtx.top_k);
        llama_sample_top_p(d_ptr->ctx, &candidates_data, promptCtx.top_p);
        llama_sample_temperature(d_ptr->ctx, &candidates_data, promptCtx.temp);
        llama_token id = llama_sample_token(d_ptr->ctx, &candidates_data);
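
For context, a minimal sketch of how candidates_data could be prepared from the model's logits before the calls above, assuming the llama_token_data / llama_token_data_array types from the updated llama.h and that <vector> and llama.h are already included; the candidates vector name is only illustrative and not taken from this pull request:

        // Build the candidate list from the raw logits of the last evaluated token
        const int n_vocab = llama_n_vocab(d_ptr->ctx);
        float *logits = llama_get_logits(d_ptr->ctx);

        std::vector<llama_token_data> candidates;
        candidates.reserve(n_vocab);
        for (llama_token token_id = 0; token_id < n_vocab; token_id++) {
            // llama_token_data holds { id, logit, p }; p is filled in by the sampling calls
            candidates.emplace_back(llama_token_data{token_id, logits[token_id], 0.0f});
        }
        llama_token_data_array candidates_data = { candidates.data(), candidates.size(), false };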

manyoso (Collaborator) commented Apr 30, 2023

I will look at this, but I will need to update the submodule at the same time, otherwise this will break. But this helps a ton! Thanks @kuvaus!

