[BCI] QVAC-17071 feat: add BCI neural signal support (variable conv1 kernel + windowed attention) #9

Closed
sharmaraju352 wants to merge 2 commits into master from feat/bci-patches

@sharmaraju352

Summary

Adds two changes to whisper.cpp to support brain-computer interface (BCI) neural signal transcription:

1. Variable conv1 kernel size

  • Reads n_audio_conv1_kernel from model hparams (defaults to 3 for standard whisper models)
  • Allows BCI models to use a different first convolution kernel size
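
A minimal sketch of why the kernel size matters for the encoder's sequence length. It assumes the first convolution keeps stride 1 and pads by half the kernel (the "same" padding used for the standard kernel of 3); the helper names below are hypothetical and are not taken from the patch.

```cpp
#include <cstdint>

// Default kernel size for standard whisper models, as stated in the PR description.
constexpr int32_t k_default_conv1_kernel = 3;

// Half padding for an odd kernel (assumption: the patch keeps this scheme
// when the kernel size is read from the model hparams).
inline int32_t conv1_half_padding(int32_t kernel) {
    return kernel / 2;
}

// Standard 1-D convolution output length with dilation 1:
//   n_out = floor((n_in + 2 * padding - kernel) / stride) + 1
inline int32_t conv1d_out_len(int32_t n_in, int32_t kernel, int32_t stride, int32_t padding) {
    return (n_in + 2 * padding - kernel) / stride + 1;
}
```

Under these assumptions, any odd kernel size leaves the number of frames unchanged after conv1, so the rest of the encoder graph would not need to change shape.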

2. Windowed self-attention for encoder layers

  • Adds n_audio_window_size and n_audio_last_window_layer hparams
  • When present, encoder self-attention is restricted to a local window for layers up to n_audio_last_window_layer
  • Adds proper SOS token (language + transcribe) initialization for BCI models
  • Both flash-attention and standard attention paths are updated
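
To illustrate what "restricted to a local window" can mean, here is a sketch of an additive attention mask. The exact window semantics are not stated in this PR; the sketch assumes a symmetric band in which position i may attend to position j when |i - j| <= window_size / 2. The function name is hypothetical.

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Builds a row-major n_ctx x n_ctx additive mask:
//   0    where attention is allowed,
//   -inf where it is blocked (so softmax assigns zero weight).
// Assumption: symmetric band of half-width window_size / 2.
std::vector<float> build_window_mask(int32_t n_ctx, int32_t window_size) {
    std::vector<float> mask(static_cast<size_t>(n_ctx) * static_cast<size_t>(n_ctx), 0.0f);
    if (window_size <= 0) {
        return mask; // windowing disabled: full attention
    }
    const int32_t half = window_size / 2;
    for (int32_t i = 0; i < n_ctx; ++i) {
        for (int32_t j = 0; j < n_ctx; ++j) {
            if (std::abs(i - j) > half) {
                mask[static_cast<size_t>(i) * n_ctx + j] = -std::numeric_limits<float>::infinity();
            }
        }
    }
    return mask;
}
```

A mask of this shape would be added to the attention scores before softmax in the standard path, or passed as the mask argument in the flash-attention path.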

Backward compatibility

Both changes are backward-compatible:

  • n_audio_conv1_kernel defaults to 3 (standard whisper behavior)
  • n_audio_window_size defaults to 0 and n_audio_last_window_layer defaults to -1, which disables windowed attention entirely
  • Standard whisper models are unaffected
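
The defaults above can be summarized in a short sketch. The struct and predicate are hypothetical; only the three field names and their default values come from this description, and treating "up to" as an inclusive bound on the layer index is an assumption.

```cpp
#include <cstdint>

// Hypothetical container for the new hparams with the stated defaults.
struct bci_hparams {
    int32_t n_audio_conv1_kernel      = 3;  // standard whisper kernel size
    int32_t n_audio_window_size       = 0;  // 0 disables windowed attention
    int32_t n_audio_last_window_layer = -1; // -1 selects no layer
};

// True when encoder layer `il` (0-based) should use the windowed mask.
// Assumption: the last windowed layer is inclusive.
inline bool use_windowed_attention(const bci_hparams & hp, int32_t il) {
    return hp.n_audio_window_size > 0 && il <= hp.n_audio_last_window_layer;
}
```

With the default values the predicate is false for every layer index, which is why standard whisper models take the unmodified attention path.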

Context

These changes are required by the new @qvac/bci-whispercpp addon: tetherto/qvac#1583

Test plan

  • Standard whisper transcription still works (no regression)
  • BCI model loads and transcribes neural signals correctly
  • Verified locally: 10.4% average WER across 5 BCI test samples
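
For reference, a sketch of how a per-sample word error rate (WER) can be computed: word-level edit distance divided by the number of reference words. This is not the script used for the figure above; the tokenization (whitespace split, no text normalization) and the function names are assumptions, and how the five samples were averaged is not stated here.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Splits on whitespace; no case folding or punctuation stripping (assumption).
static std::vector<std::string> split_words(const std::string & text) {
    std::istringstream iss(text);
    std::vector<std::string> words;
    std::string w;
    while (iss >> w) {
        words.push_back(w);
    }
    return words;
}

// WER = (substitutions + deletions + insertions) / reference word count,
// computed with a two-row Levenshtein distance over words.
double word_error_rate(const std::string & reference, const std::string & hypothesis) {
    const std::vector<std::string> ref = split_words(reference);
    const std::vector<std::string> hyp = split_words(hypothesis);
    if (ref.empty()) {
        return hyp.empty() ? 0.0 : 1.0;
    }
    std::vector<size_t> prev(hyp.size() + 1), cur(hyp.size() + 1);
    for (size_t j = 0; j <= hyp.size(); ++j) {
        prev[j] = j; // empty reference prefix: j insertions
    }
    for (size_t i = 1; i <= ref.size(); ++i) {
        cur[0] = i; // empty hypothesis prefix: i deletions
        for (size_t j = 1; j <= hyp.size(); ++j) {
            const size_t sub = prev[j - 1] + (ref[i - 1] == hyp[j - 1] ? 0 : 1);
            cur[j] = std::min({sub, prev[j] + 1, cur[j - 1] + 1});
        }
        std::swap(prev, cur);
    }
    return static_cast<double>(prev[hyp.size()]) / static_cast<double>(ref.size());
}
```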

Made with Cursor

@sharmaraju352 requested review from a team as code owners on April 16, 2026 09:44

Read n_audio_conv1_kernel from model hparams to allow BCI models
to use a non-standard first convolution kernel size. Standard
whisper models default to kernel size 3.

Made-with: Cursor

Add windowed attention mask support controlled by two new model
hparams: n_audio_window_size and n_audio_last_window_layer. When
present in the model, encoder self-attention is restricted to a
local window for layers up to last_window_layer, improving BCI
neural signal transcription quality.

Also adds proper SOS token (language + transcribe) initialization
when the model uses windowed attention, matching the behavior
expected by BCI-trained models.

Made-with: Cursor
@sharmaraju352

Closing in favor of a new PR based on v1.8.4.1 with flash attention fix.

@gianni-cor gianni-cor deleted the feat/bci-patches branch May 28, 2026 13:44