UPSTREAM PR #17216: server: split HTTP into its own interface #208
Mirrored from ggml-org/llama.cpp#17216
Fix ggml-org/llama.cpp#16488
How it works:
```mermaid
sequenceDiagram
    participant User
    participant server_http_context
    participant server_http_res
    User->>server_http_context: request
    server_http_context->>server_http_req: create request
    server_http_req->>handler:
    handler->>server_http_res: create response
    loop for each result
        server_http_res->>server_http_context: response chunk
        server_http_context->>User: response chunk
        server_http_context->>server_http_res: next()
    end
    server_http_res->>server_http_context: terminate
    server_http_context->>User: close connection
```

- The handler creates a `server_res_generator`, which is a derived class of `server_http_res`
- `server_res_generator` indicates one of 2 modes: stream or non-stream
- `server_res_generator::next()` is called until it returns `false`; each call to `next()` yields a new chunk of data

TODO:

- `server_routes` level

Testing:
- `tests.sh`