feat: visual polish, Colab demo, CI workflow, community templates, and PCACalibrator implementation by OnlyTerp · Pull Request #5 · OnlyTerp/kvtc

OnlyTerp · 2026-04-17T04:35:19Z

What does this PR do?

Adds visual assets, community infrastructure, and CI to make the repo more professional and approachable. Also implements the missing PCACalibrator class and fixes a validation gap in apply_rope, both of which were needed to get CI passing.

Related Issues

Follows up on #4 (README overhaul) — this PR adds the visual/interactive layer on top of that content.

Changes

Hero banner (assets/banner.svg) — dark-themed SVG with pipeline visualization and key stats (6-9x, 0.996 cosine, 2M+ context)
4 benchmark charts (assets/*.png) — compression vs quality scatter, context window bar chart, prefill throughput, pipeline overview
Colab demo notebook (notebooks/kvtc_demo.ipynb) — interactive walkthrough of all 3 pipeline stages with matplotlib visualizations
GitHub Actions CI (.github/workflows/test.yml) — runs pytest src/test_kvtc.py on Python 3.10/3.11/3.12 matrix with CPU-only torch
Issue/PR templates (.github/ISSUE_TEMPLATE/, .github/pull_request_template.md)
README updates:
- Replaced ASCII pipeline diagrams with Mermaid (graph LR, flowchart TB, quadrantChart)
- Embedded benchmark chart images in new "Visual Benchmarks" section
- Added CI badge and "Open in Colab" badge to header
- Added "Interactive Demo" section linking to the Colab notebook
PCACalibrator class (src/pca.py) — implements the missing calibrator that calibrate.py, calibrate_vllm.py, and test_kvtc.py all import but was never defined. Collects KV samples, computes PCA bases via SVD, and runs DP bit allocation.
apply_rope validation (src/pca.py) — added ValueError for odd head_dim (RoPE requires even dimensions)

Updates since last revision

CI initially failed for two reasons:

ImportError: cannot import name 'PCACalibrator' from 'src.pca' — The class was imported across multiple files but never defined. Implemented it with collect() / compute() methods matching all existing call sites.
test_single_token_edge_case crash — When SVD returns fewer components than dimensions (rank-deficient case, e.g. single token), vh shape is [k, dim] with k < dim. Fixed by zero-padding eigenvectors to [dim, dim].
test_rope_requires_even_head_dim crash — Test expected ValueError for odd head_dim=7 but got RuntimeError from shape mismatch. Added explicit validation at the top of apply_rope.

All 38 tests now pass on Python 3.10, 3.11, and 3.12.

Important review items

⚠️ PCACalibrator eigenvector convention — SVD returns vh with rows as eigenvectors. pca_transform(data, eigvecs) computes data @ eigvecs.T. The stored eigenvectors field uses rows-as-eigenvectors (i.e., vh directly). Verify this matches what pipeline.py expects at lines 67 and 146.
⚠️ Rank-deficient SVD padding — When k < dim, zero rows are appended to the eigenvector matrix. These zero-eigenvalue components get 0 bits from DP allocation, so they should be harmless — but worth verifying that compute_quant_params and the decompression path handle the zero rows correctly.
Colab notebook import paths — The notebook does pip install git+... then imports from src.pca import .... Verify these imports resolve correctly after pip install (depends on how setup.py exposes packages).
CI workflow — Uses CPU-only torch (--index-url .../whl/cpu). Confirm that the test suite runs without GPU.
Mermaid quadrantChart — This is a newer Mermaid diagram type. Check that GitHub renders it correctly in the Landscape section.
Banner SVG — Uses system-ui and monospace fonts. Visual check recommended on GitHub's dark and light themes.

Testing

Unit tests pass (pytest src/test_kvtc.py -v) — 38 pass, 0 fail
CI workflow triggers and passes on push (Python 3.10, 3.11, 3.12)
Colab notebook runs end-to-end (install → import → visualizations)
Mermaid diagrams render correctly on GitHub
Banner SVG displays properly in both dark/light mode
All 4 chart PNGs display in README

Checklist

Code follows existing style conventions
Documentation updated
No new dependencies added without justification

Link to Devin session: https://app.devin.ai/sessions/e367c15ff93343faa5e821eb3babf465
Requested by: @OnlyTerp

…ates - Hero banner SVG with pipeline visualization - 4 benchmark charts (compression vs quality, context window, throughput, pipeline) - Interactive Colab demo notebook (3-stage pipeline walkthrough) - GitHub Actions CI (Python 3.10/3.11/3.12 matrix) - Issue templates (bug report, feature request) and PR template - README: Mermaid diagrams, embedded charts, CI + Colab badges - README: Interactive Demo section with Colab link Co-Authored-By: Rob <onerobby@gmail.com>

devin-ai-integration · 2026-04-17T04:35:22Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

devin-ai-integration

Devin Review found 3 potential issues.

View 3 additional findings in Devin Review.

devin-ai-integration · 2026-04-17T04:37:36Z

+    "from src.pca import compute_pca_basis, pca_transform, pca_inverse\n",
+    "from src.quantize import dp_allocate_bits, quantize_uniform, dequantize_uniform\n",


🔴 Notebook imports non-existent function names, causing ImportError

The notebook imports dp_allocate_bits, quantize_uniform, dequantize_uniform from src.quantize and compute_pca_basis from src.pca, but none of these names exist in the actual modules. The real names in src/quantize.py are dp_bit_allocation, uniform_quantize, uniform_dequantize, and compute_pca_basis does not exist at all in src/pca.py. This causes an ImportError that prevents all subsequent notebook cells (Stage 1, Stage 2, Stage 3) from running.

Actual function names in src/quantize.py

dp_bit_allocation (not dp_allocate_bits)

uniform_quantize (not quantize_uniform)

uniform_dequantize (not dequantize_uniform)

compute_pca_basis doesn't exist anywhere in the codebase.

Suggested change

"from src.pca import compute_pca_basis, pca_transform, pca_inverse\n",

"from src.quantize import dp_allocate_bits, quantize_uniform, dequantize_uniform\n",

"from src.pca import pca_transform, pca_inverse\n",

"from src.quantize import dp_bit_allocation as dp_allocate_bits, uniform_quantize, uniform_dequantize\n",

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-17T04:37:37Z

+    "            ix, s, z = quantize_uniform(col, int(bits[j].item()))\n",
+    "            recon_pca[:, j] = dequantize_uniform(ix, s, z, int(bits[j].item()))\n",


🔴 Notebook calls quantize/dequantize with wrong signatures and expects wrong return types

Even if the import names were fixed, the notebook calls quantize_uniform(col, int(bits[j].item())) expecting it to return a tuple (ix, s, z), and then calls dequantize_uniform(ix, s, z, int(bits[j].item())). However, the actual uniform_quantize at src/quantize.py:59 takes 4 arguments (values, n_bits, scale, zero_point) and returns a single tensor of indices. The notebook's Stage 3 compression roundtrip would crash with a TypeError even after fixing the import names. The code needs to use compute_quant_params (src/quantize.py:77) to get scale/zero_point, then pass them to uniform_quantize and uniform_dequantize.

Prompt for agents

The notebook's Stage 3 roundtrip loop calls the quantize/dequantize functions with the wrong API. In src/quantize.py, uniform_quantize(values, n_bits, scale, zero_point) takes 4 args and returns a tensor. uniform_dequantize(indices, n_bits, scale, zero_point) also takes 4 args. The notebook needs to compute scale and zero_point first (either manually from min/max of each column, or by using compute_quant_params from src/quantize.py). The loop at lines 191-195 should be rewritten to: (1) compute min/max for each active column, (2) derive scale = (max-min) / (2^bits - 1) and zero_point = -min/scale, (3) call uniform_quantize(col, bits, scale, zero_point), (4) call uniform_dequantize(indices, bits, scale, zero_point).

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-17T04:37:38Z

+    "device = 'cuda' if torch.cuda.is_available() else 'cpu'\n",
+    "if device == 'cuda':\n",
+    "    gpu_name = torch.cuda.get_device_name(0)\n",
+    "    gpu_mem = torch.cuda.get_device_properties(0).total_mem / 1e9\n",


🟡 Notebook uses non-existent total_mem attribute instead of total_memory

The notebook accesses torch.cuda.get_device_properties(0).total_mem, but the correct PyTorch attribute is total_memory. This raises an AttributeError on GPU environments (including Colab T4 where this notebook is designed to run). Every other file in the repo correctly uses getattr(props, 'total_memory', None) or getattr(props, 'total_mem', 0) as a safe fallback (e.g., benchmarks/benchmark_v3.py:46), but the notebook doesn't follow this pattern.

Suggested change

" gpu_mem = torch.cuda.get_device_properties(0).total_mem / 1e9\n",

" gpu_mem = torch.cuda.get_device_properties(0).total_memory / 1e9\n",

Was this helpful? React with 👍 or 👎 to provide feedback.

PCACalibrator was referenced by calibrate.py, calibrate_vllm.py, and test_kvtc.py but was never defined. This caused an ImportError during test collection in CI. The class collects KV cache samples, computes PCA bases via SVD, and runs DP bit allocation — matching all existing call sites. Co-Authored-By: Rob <onerobby@gmail.com>

…_dim - When num_samples < dim, SVD returns fewer components than dimensions. Pad eigenvectors to [dim, dim] so pca_transform works correctly. - Add ValueError for odd head_dim in apply_rope (RoPE requires even dims). Co-Authored-By: Rob <onerobby@gmail.com>

devin-ai-integration

Devin Review found 1 new potential issue.

View 6 additional findings in Devin Review.

devin-ai-integration · 2026-04-17T04:46:50Z

+            group_start = group_idx * self.head_group_size
+            head_indices = list(range(group_start, group_start + self.head_group_size))


🟡 head_indices includes out-of-bounds head indices for last group when heads aren't divisible by group size

In PCACalibrator.compute(), head_indices is always built with exactly self.head_group_size entries (src/pca.py:278), but collect() correctly clips the last group to the actual number of heads (src/pca.py:212: group_end = min(group_start + self.head_group_size, heads)). When the number of heads isn't evenly divisible by head_group_size (e.g., 5 heads with group_size=2), the last group's head_indices will be [4, 5] when only head indices [0..4] exist. This produces incorrect metadata in PCAEntry.head_indices. Currently no pipeline code reads this field, but any future consumer (or the vLLM integration) relying on it would get wrong results.

Prompt for agents

In PCACalibrator.compute() at src/pca.py:277-278, head_indices is computed as range(group_start, group_start + self.head_group_size), but this doesn't clip to the actual head count for the last group. The collect() method at line 212 correctly uses min(group_start + self.head_group_size, heads) but the compute() method has no access to the original head count. To fix this, either: 1. Store the actual group head count per key during collect() (e.g., in a separate dict _group_heads mapping CalibrationKey to int), then use it in compute() to build head_indices correctly. 2. Alternatively, infer the actual group size from the collected data dimensions — since each sample has shape [seq_len * actual_group_heads, dim], and we know the seq_len from positions, we can compute actual_group_heads. The simplest approach is option 1: add a _group_heads dict, set it in collect(), and use it in compute() to build the correct head_indices range.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration bot assigned OnlyTerp Apr 17, 2026

devin-ai-integration bot reviewed Apr 17, 2026

View reviewed changes

devin-ai-integration bot and others added 2 commits April 17, 2026 04:39

devin-ai-integration bot reviewed Apr 17, 2026

View reviewed changes

devin-ai-integration bot changed the title ~~feat: visual polish, Colab demo, CI workflow, and community templates~~ feat: visual polish, Colab demo, CI workflow, community templates, and PCACalibrator implementation Apr 17, 2026

OnlyTerp merged commit 79d2906 into master Apr 17, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: visual polish, Colab demo, CI workflow, community templates, and PCACalibrator implementation#5

feat: visual polish, Colab demo, CI workflow, community templates, and PCACalibrator implementation#5
OnlyTerp merged 3 commits intomasterfrom
devin/1776399930-visual-polish-colab-ci

OnlyTerp commented Apr 17, 2026 •

edited by devin-ai-integration bot

Loading

Uh oh!

devin-ai-integration bot commented Apr 17, 2026

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Uh oh!

devin-ai-integration bot Apr 17, 2026

Uh oh!

devin-ai-integration bot Apr 17, 2026

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		"from src.pca import compute_pca_basis, pca_transform, pca_inverse\n",
		"from src.quantize import dp_allocate_bits, quantize_uniform, dequantize_uniform\n",

		" ix, s, z = quantize_uniform(col, int(bits[j].item()))\n",
		" recon_pca[:, j] = dequantize_uniform(ix, s, z, int(bits[j].item()))\n",

	" gpu_mem = torch.cuda.get_device_properties(0).total_mem / 1e9\n",
	" gpu_mem = torch.cuda.get_device_properties(0).total_memory / 1e9\n",

		group_start = group_idx * self.head_group_size
		head_indices = list(range(group_start, group_start + self.head_group_size))

Conversation

OnlyTerp commented Apr 17, 2026 • edited by devin-ai-integration bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Related Issues

Changes

Updates since last revision

Important review items

Testing

Checklist

Uh oh!

devin-ai-integration bot commented Apr 17, 2026

🤖 Devin AI Engineer

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

OnlyTerp commented Apr 17, 2026 •

edited by devin-ai-integration bot

Loading