
UPSTREAM PR #18593: CUDA: disable cuda graph when using n-cpu-moe#812

Open
loci-dev wants to merge 2 commits into main from upstream-PR18593-branch_am17an-cuda-graph-disable-cpu-moe

Conversation


@loci-dev loci-dev commented Jan 4, 2026

Mirrored from ggml-org/llama.cpp#18593

CUDA graphs were not being disabled when `-n-cpu-moe` is used; this change adds the missing check.


loci-review bot commented Jan 4, 2026

Explore the complete analysis inside the Version Insights

Perfect! I've generated a summary report for your project. Here's what the analysis shows:

Summary Report for llama.cpp PR #812

Key Highlights:

  • Performance Impact: MINIMAL - No significant performance changes detected
  • No modified function showed a performance change exceeding the 2% threshold
  • Both Response Time and Throughput Time remain stable

Recommendation: This pull request is safe to merge from a performance perspective: it maintains performance stability without introducing regressions.

The report compares the base version (3c2893e1-e8f1-11f0-81f2-dbb430499cb5) against the target version (7674c3c1-e973-11f0-81f2-dbb430499cb5) for the auroralabs-loci/llama.cpp repository.

loci-dev force-pushed the main branch 26 times, most recently from 4be6e0f to cfa0e20 on January 7, 2026 at 16:12.
loci-dev force-pushed the main branch 30 times, most recently from 8e509d5 to 63d526f on January 13, 2026 at 23:09.
