
UPSTREAM PR #17357: chat: fix int overflow, prevent size calculation in float/double#257

Open
DajanaV wants to merge 2 commits into main from upstream-PR17357-branch_ngxson-xsn/chat_int_overflow

Conversation


@DajanaV DajanaV commented Nov 18, 2025

Mirrored from ggml-org/llama.cpp#17357

Fix ggml-org/llama.cpp#17355

Hmm, on second thought, yes, it is better to use size_t: even if the value overflows, it won't become negative.

But please note that overflow can still potentially happen.

Also, I hadn't noticed that `* 1.25` converts the value to double, so we should use integer division instead.

ngxson and others added 2 commits November 18, 2025 11:47

loci-review bot commented Nov 18, 2025

Access the complete analysis in the LOCI Dashboard

Performance Analysis Summary

Overview

Pull Request #257 implements integer overflow fixes and arithmetic optimizations in the chat template system (common_chat_templates_apply_legacy function). The changes include converting int to size_t for buffer allocation, replacing floating-point multiplication with integer arithmetic, and adding safety validation checks.

Key Findings

Performance Regression Identified:

  • Highest Response Time Change: std::vector<std::sub_match>::begin() shows +210.8% increase (84 ns → 260 ns, +176 ns absolute)
  • Highest Throughput Change: std::unordered_set<std::string>::begin() shows +303.0% increase (60 ns → 242 ns, +182 ns absolute)

Core Function Impact Assessment:
The performance regressions occur in STL container iterator functions, not in LLaMA.cpp core inference functions (llama_decode, llama_encode, llama_tokenize). These functions are not part of the critical tokenization/inference pipeline; therefore, tokens-per-second performance remains unaffected.

Power Consumption Analysis:

  • build.bin.llama-tts: +3.97% increase (2833 nJ → 2945 nJ)
  • build.bin.llama-run: -5.56% decrease (2659 nJ → 2511 nJ)
  • build.bin.llama-cvector-generator: -1.23% decrease (3593 nJ → 3549 nJ)

Root Cause Analysis:
CFG comparison reveals the performance regression stems from compiler optimization changes, not the PR code modifications. The new version introduces an unnecessary unconditional branch in the function prologue, fragmenting instruction cache locality and disrupting pipeline efficiency.

Code Quality Assessment:
The PR implements essential correctness improvements:

  • Prevents integer overflow in buffer allocation
  • Eliminates floating-point precision issues
  • Adds defensive validation checks

Actionable Recommendations:

  1. Investigate compiler configuration - Review optimization flags and compiler versions between builds
  2. Consider profile-guided optimization - Address the unconditional branch overhead in container operations
  3. Verify build consistency - Ensure identical compilation environments across versions

The code changes represent sound engineering practices and should not cause performance regressions. The observed issues appear related to build system or compiler optimization differences rather than the implemented fixes.

@loci-dev loci-dev force-pushed the main branch 26 times, most recently from d5eb05c to 819d372 Compare November 24, 2025 19:07
@loci-dev loci-dev force-pushed the main branch 30 times, most recently from cbd9848 to 77b6fa0 Compare December 1, 2025 12:15