
Add a /detokenize endpoint to the example server #2802

Merged 2 commits into ggerganov:master on Aug 26, 2023

Conversation

@BruceMacD (Contributor)

Expose the ability to convert tokens to strings in the example server.

$ curl -X 'POST' \
 -d '{"content":"hello world"}' -H 'Content-Type: application/json' \
 'http://127.0.0.1:8080/tokenize'
{"tokens":[29871,22172,3186]}

$ curl -X 'POST' \
 -d '{"tokens":[29871,22172,3186]}' -H 'Content-Type: application/json' \
 'http://127.0.0.1:8080/detokenize'
{"content":"  hello world"}

resolves #2801

@jhen0409 (Collaborator)

The CI failure needs to be fixed; everything else looks good.

@BruceMacD (Contributor, Author)

Thanks for taking a look. I fixed the formatting issue, so the workflows should pass now.

jhen0409 merged commit c1ac54b into ggerganov:master on Aug 26, 2023
25 checks passed
mattgauf added a commit to mattgauf/llama.cpp that referenced this pull request Aug 26, 2023
* master: (773 commits)
  server : add `/detokenize` endpoint (ggerganov#2802)
  convert.py : advanced option (ggerganov#2753)
  llama : use Unicode Escape Sequence to replace encoded characters (ggerganov#2814)
  flake.nix : add rocm support and cleanup (ggerganov#2808)
  llama : move #includes out of _GNU_SOURCE conditional (ggerganov#2817)
  main : fix bug (penalize_nl=false doesn't work) + suppress warning on mingw (ggerganov#1528)
  llama : use std::abs in llama_sample_tail_free (ggerganov#2800)
  k-quants : remove unnecessary tensor shape restrictions (ggerganov#2811)
  Better perplexity for 2- and 3-bit quantization for LLaMA-v2-70B (ggerganov#2807)
  Fix HellaSwag (ggerganov#2805)
  flake : build llama.cpp on Intel with nix (ggerganov#2795)
  Handle null rope scaling value (ggerganov#2793)
  Fix spm whitespaces (ggerganov#2806)
  examples : skip unnecessary external lib in server README.md how-to (ggerganov#2804)
  llama : fix struct decl (ggerganov#2790)
  Faster perplexity computation (ggerganov#2786)
  llama : add llama_beam_search() (ggerganov#2267)
  convert.py : Get rope scale from HuggingFace models (ggerganov#2772)
  llama-bench : add model sizes (ggerganov#2771)
  convert.py : export rope freq_base when converting CodeLlama from an HF model (ggerganov#2773)
  ...
akawrykow pushed a commit to akawrykow/llama.cpp that referenced this pull request Aug 29, 2023
* Add a /detokenize endpoint to the example server

* remove trailing white-space