[Bug] request parameter `min_new_tokens` is not used #2678

Huarong · 2024-10-29T08:04:16Z

Checklist

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version.
3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

The `min_new_tokens` value in the server log is not the same as the value sent in the request.

Reproduction

server
backend: turbomind

request

curl --location --request POST 'http://xxx:xxx/v1/completions' \
--header 'Content-Type: application/json' \
--data-raw '{
    "model": "xxx",
    "prompt": "hi",
    "max_tokens": 128,
    "stream": false,
    "temperature": 0.1,
    "top_p": 0.9,
    "repetition_penalty": 1.1,
    "min_new_tokens": 4,
    "user": "user-1234"
}'

got log

expected
min_new_tokens to be 4

Environment

sys.platform: linux
Python: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: x86_64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
PyTorch: 2.3.0+cu121
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.3.6 (Git Hash 86e6af5974177e513fd3fee58425e1063e7f1361)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=12.1, CUDNN_VERSION=8.9.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=2.3.0, USE_CUDA=ON, USE_CUDNN=ON, USE_CUSPARSELT=1, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_GLOO=ON, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, USE_ROCM_KERNEL_ASSERT=OFF,

TorchVision: 0.18.0+cu121
LMDeploy: 0.6.1+2e49fc3
transformers: 4.45.1
gradio: 4.44.0
fastapi: 0.115.0
pydantic: 2.9.2
triton: 2.3.0

Error traceback

No response


AllentDan · 2024-10-29T10:26:16Z

May try #2681

AllentDan · 2024-10-29T10:46:56Z

We did not plan to support `min_new_tokens` on the `/v1/completions` endpoint. Besides, any parameter that is not part of the OpenAI API should be passed through `extra_body`, e.g. `extra_body=dict(min_new_tokens=xxx)`.
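For reference, a minimal sketch of what the `extra_body` suggestion amounts to on the wire. As far as I understand the OpenAI Python client, the keys in `extra_body` are added as extra top-level properties of the JSON request body; the helper `build_request_body` below is hypothetical and only imitates that merge with the standard library, using the values from the request in this issue. Whether the server then honors `min_new_tokens` on `/v1/completions` depends on the fix referenced in #2681.

```python
import json

# OpenAI-standard /v1/completions fields, copied from the request in this issue.
standard_params = {
    "model": "xxx",  # placeholder, as in the original report
    "prompt": "hi",
    "max_tokens": 128,
    "stream": False,
    "temperature": 0.1,
    "top_p": 0.9,
}

# Parameter that is not part of the OpenAI API, passed separately.
extra_body = dict(min_new_tokens=4)


def build_request_body(params: dict, extra: dict) -> str:
    """Hypothetical helper: merge the extra fields into the top-level JSON body."""
    return json.dumps({**params, **extra})


body = build_request_body(standard_params, extra_body)
print(body)

# With the official client, the equivalent call would look roughly like:
#   client.completions.create(model="xxx", prompt="hi", max_tokens=128,
#                             extra_body=dict(min_new_tokens=4))
```

Note that the merged body has `min_new_tokens` at the top level, which is the same shape as the curl request in the report.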

Huarong · 2024-10-30T03:56:26Z

> May try #2681

Thank you. Looking forward to it being merged.

lvhan028 assigned AllentDan Oct 29, 2024

Huarong closed this as completed Oct 30, 2024

AllentDan mentioned this issue Oct 30, 2024

[Bug] min_p from request is not used #2682

Closed

3 tasks
