Add prompt optimization telemetry (prompt cache hit rate tracking) to sidebar and also kevin's glm prompt #4
Closed
arihantchoudhary wants to merge 16 commits into dev from
Conversation
Complete overhaul of README to document all changes made to the project:
- Added comprehensive changelog organized by date
- Documented all 10 commits with detailed explanations
- Included file-level changes and impact analysis
- Added summary statistics (237 files, 22,700+ additions)
- Documented Python SDK implementation (229 new files)
- Explained PKCE authentication implementation
- Detailed provider architecture refactoring
- Listed all technical improvements and their business impact

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Organized all of Kevin's commits into four strategic pillars:
- Idea Stack (6 commits): Architecture, infrastructure, CI/CD
- TUI Features (2 commits): UI/UX improvements, metrics display
- SDK Level Features (4 commits): Auth, retry logic, Python SDK
- Experiment and Reporting (1 commit): Usage metrics foundation

Each commit includes:
- Commit ID and date
- Files modified with line counts
- Detailed changes description
- Business/technical impact
- Clear categorization for future development focus

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Complete technical documentation of all changes needed to get cache information from the Cerebras API to the sidebar display.

Architecture overview:
- Problem: API format mismatch (Chat Completions vs Responses API)
- Solution: Support both formats with nullish coalescing
- Result: Cache data flows API → SDK → DB → UI

5 changes documented:
1. Schema update to accept both API formats (prompt_tokens & input_tokens)
2. Non-streaming parser to extract cache from both field names
3. Streaming parser to extract cache from both field names
4. Sidebar calculation logic for cache hit rate
5. Sidebar UI display with color-coded feedback

Includes:
- Before/after code comparisons for each change
- Complete data flow diagrams
- Testing procedures
- Architecture diagrams
- Related commits and next steps

Files modified: 2 files, 5 locations
Result: Zero breaking changes, backward compatible

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
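The nullish-coalescing normalization this commit describes could look roughly like the sketch below. The `RawUsage` shape mirrors the two field names the commit mentions (`prompt_tokens` vs `input_tokens`); the nested `cached_tokens` detail fields and the function name are illustrative assumptions, not the PR's actual code.

```typescript
// Hypothetical shape of a usage payload that may arrive in either the
// Chat Completions form (prompt_tokens) or the Responses API form (input_tokens).
interface RawUsage {
  prompt_tokens?: number;
  input_tokens?: number;
  // Detail objects carrying the cached-token count are an assumption here.
  prompt_tokens_details?: { cached_tokens?: number };
  input_tokens_details?: { cached_tokens?: number };
}

// Accept either field name via nullish coalescing, so neither format breaks.
function extractCacheUsage(usage: RawUsage): { inputTokens: number; cachedTokens: number } {
  const inputTokens = usage.prompt_tokens ?? usage.input_tokens ?? 0;
  const cachedTokens =
    usage.prompt_tokens_details?.cached_tokens ??
    usage.input_tokens_details?.cached_tokens ??
    0;
  return { inputTokens, cachedTokens };
}
```

Because `??` only falls through on `null`/`undefined`, a legitimate `0` token count in either format is preserved rather than being skipped.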
Changes:
- Added "Commits Categorized by Strategic Pillars" section to README
- Organized all 10 commits into 4 pillars:
  * Pillar 1: Idea Stack (6 commits)
  * Pillar 2: TUI Features (2 commits)
  * Pillar 3: SDK Level Features (4 commits)
  * Pillar 4: Experiment and Reporting (1 commit)
- Removed COMMIT_CATEGORIZATION.md (content now in README)

Result: Single source of truth in README with clearer organization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
1. Moved "Commits Categorized by Strategic Pillars" to top of README
   - Now appears right after Collaborators section
   - Added warning banner to update with each commit
2. Created README update enforcement system:
   - Added .github/README_UPDATE_RULE.md with complete guidelines
   - Updated .husky/pre-push hook to check if README was updated
   - Hook prompts user if README not modified
3. Updated README structure:
   - Removed duplicate pillars section at bottom
   - Added focus descriptions for each pillar
   - Added this commit to Pillar 1

Result: Self-enforcing documentation system that keeps README current

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Added major cache optimizations section with detailed explanations of various improvements and changes to caching strategies.

Co-authored-by: Isaac Tai <Isaac.Tai@cerebras.net>
…v1.0.140

Co-authored-by: Isaac Tai <Isaac.Tai@cerebras.net>
…ttribution

- Create dedicated glm.txt prompt file for GLM 4.6 model
- Update system.ts to route GLM models to the new prompt
- Add Cerebras Code branding and cloud.cerebras.ai references
- Auto-include Isaac Tai as co-author on all GLM-generated commits

🤖 Generated with Cerebras Code cloud.cerebras.ai

Co-Authored-By: Isaac Tai <Isaac.Tai@cerebras.net>
This commit adds real-time cache hit rate monitoring to the session sidebar, displaying current and average cache efficiency with a sparkline visualization.

Features:
- Current cache hit rate percentage with token breakdown
- Average hit rate across all requests in session
- Visual sparkline graph showing trend over last 20 requests
- Automatic updates after each API response
- Only displays when cache data is available

The sparkline uses Unicode block characters (▁▂▃▄▅▆▇█) to show whether cache efficiency is improving or declining throughout the session.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
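A minimal sketch of the Unicode-block sparkline technique this commit describes, assuming hit rates are already normalized to the 0–1 range; the function name, parameter names, and 20-sample default are illustrative, not the PR's actual code.

```typescript
// The eight block characters mentioned in the commit, from lowest to highest.
const BLOCKS = ["▁", "▂", "▃", "▄", "▅", "▆", "▇", "█"];

// Render the most recent `window` hit rates (each 0..1) as a one-line sparkline.
function sparkline(rates: number[], window = 20): string {
  return rates
    .slice(-window) // keep only the last `window` samples
    .map((r) => {
      const clamped = Math.min(1, Math.max(0, r));
      // Map 0..1 onto the eight blocks; a rate of exactly 1 still lands on the top block.
      const idx = Math.min(BLOCKS.length - 1, Math.floor(clamped * BLOCKS.length));
      return BLOCKS[idx];
    })
    .join("");
}
```

For example, `sparkline([0, 0.5, 1])` renders a rising trend as `▁▅█`, which is the kind of at-a-glance improving/declining signal the commit message describes.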
Owner

Can we remove some of the AI-generated markdown files?
Owner

Also, since my other PR for the system prompt is open and a WIP, can we handle the prompting integration in that branch rather than adding it to this one?
5a6ce89 to 04459cc (Compare)

kassieclaire added a commit to kassieclaire/cerebras-code-cli that referenced this pull request on Feb 27, 2026
Phase 01: Exploration & Architecture Validation

Changes:
- 01-01: Added sidebar split pattern task, combined documentation, added ARCH verification
- 01-02: Split from old 01-02 (keyboard infrastructure only, 3 tasks)
- 01-03: NEW - keybind audit + documentation (split from old 01-02)
- 01-04: Renumbered from 01-03, fixed must_haves for store-agnostic design

Issues addressed:
- Blocker kevint-cerebras#1: Added sidebar split pattern task to 01-01 (Success Criteria kevint-cerebras#3)
- Blocker kevint-cerebras#2: Split 01-02 (5 tasks) into 01-02 (3) + 01-03 (2)
- Blocker kevint-cerebras#3: Fixed must_haves/key_links for store-agnostic prototype
- Warning kevint-cerebras#4: Fixed wave dependencies (01-02 wave 2, 01-03 wave 3)
- Warning kevint-cerebras#5: Added RESEARCH.md to <files> for exploration tasks
- Warning kevint-cerebras#6: Added test file to 01-04 files_modified
- Warning kevint-cerebras#8: Added ARCH constraint verification to 01-01 and 01-04

Ready for re-verification
Summary
This PR adds real-time prompt cache hit rate monitoring to the session sidebar with visual trend tracking.
Features added:

Implementation:
- `cachedInputTokens` field from assistant messages (line 1236 in openai-responses-language-model.ts)
- `createMemo` for efficient updates

Example output:
The sparkline allows users to visually track whether cache efficiency is improving or declining throughout their session, helping optimize prompt engineering for better cache utilization.
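The current/average hit-rate arithmetic behind the sidebar display can be sketched as follows; the `RequestUsage` type and both function names are hypothetical, not taken from the PR.

```typescript
// Per-request token counts after normalization, as the sidebar would consume them.
interface RequestUsage {
  inputTokens: number;
  cachedTokens: number;
}

// Hit rate for one request; null when there is no input to measure,
// matching the "only displays when cache data is available" behavior.
function hitRate(u: RequestUsage): number | null {
  return u.inputTokens > 0 ? u.cachedTokens / u.inputTokens : null;
}

// Average hit rate across every request in the session that reported cache data.
function averageHitRate(history: RequestUsage[]): number | null {
  const rates = history.map(hitRate).filter((r): r is number => r !== null);
  if (rates.length === 0) return null;
  return rates.reduce((a, b) => a + b, 0) / rates.length;
}
```

Returning `null` rather than `0` lets the UI distinguish "no cache data yet" (hide the stat) from "cache is genuinely cold" (show 0%).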
Test plan
🤖 Generated with Claude Code