Description
System Info
docker run -it --rm -p 7997:7997 --gpus all ghcr.io/huggingface/text-embeddings-inference:hopper-1.6 \
  --model-id BAAI/bge-large-en-v1.5 --port 7997 --max-client-batch-size 8000 --max-batch-size 64 --max-concurrent-requests 512
revision: None, tokenization_workers: Some(8), dtype: None, pooling: None, max_concurrent_requests: 512, max_batch_tokens: 16384, max_batch_requests: Some(32), max_client_batch_size: 8000, auto_truncate: true, default_prompt_name: None, default_prompt: None, hf_api_token: None, hostname: "xxx", port: 7997, uds_path: "/tmp/text-embeddings-inference-server", huggingface_hub_cache: Some("/data"), payload_limit: 2000000, api_key: None, json_output: false, otlp_endpoint: None, otlp_service_name: "text-embeddings-inference.server", cors_allow_origin: None }
2025-01-13T22:43:09.523081Z INFO openai_embed{total_time="3.53211965s" tokenization_time="1.65615ms" queue_time="504.499706ms" inference_time="33.584808ms"}: text_embeddings_router::http::server: router/src/http/server.rs:1260: Success
2025-01-13T22:43:09.792384Z INFO openai_embed{total_time="3.801234069s" tokenization_time="1.672146ms" queue_time="503.120879ms" inference_time="33.486594ms"}: text_embeddings_router::http::server: router/src/http/server.rs:1260: Success
2025-01-13T22:43:10.033664Z INFO openai_embed{total_time="4.041983871s" tokenization_time="1.643926ms" queue_time="504.332069ms" inference_time="33.626412ms"}: text_embeddings_router::http::server: router/src/http/server.rs:1260: Success
2025-01-13T22:43:20.890172Z INFO embed{total_time="128.536668ms" tokenization_time="2.85406ms" queue_time="58.577982ms" inference_time="9.716932ms"}: text_embeddings_router::http::server: router/src/http/server.rs:714: Success
thread 'tokio-runtime-worker' panicked at core/src/queue.rs:72:14:
Queue background task dropped the receiver or the receiver is too behind. This is a bug.: "Full(..)"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Workload:
- cycling between a 1 min break, 1 validation request (via `/embed`), and a 2 min throughput benchmark (via `/openai_embed`)
Result:
- the queue `Full(..)` panic is raised here (core/src/queue.rs):
pub fn append(&self, entry: Entry) {
    // Send append command to the background task managing the state
    // Unwrap is safe here
    self.queue_sender
        .try_send(QueueCommand::Append(Box::new(entry), Span::current()))
        .expect("Queue background task dropped the receiver or the receiver is too behind. This is a bug.");
}
Information
- Docker
- The CLI directly
Tasks
- An officially supported command
- My own modifications
Reproduction
Expected behavior