feat(skills): add screen-recording awareness skill #8011
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| --- | ||
| name: "Screen Recording" | ||
| description: "Capture screen recordings during computer-use sessions. Records the display or a specific window as H.264 MP4 video with optional audio." | ||
| user-invocable: false | ||
| disable-model-invocation: false | ||
| metadata: | ||
| vellum: | ||
| emoji: "🎥" | ||
| os: ["darwin"] | ||
|
Comment on lines +6 to +9: The skill loader only parses […] |
||
| --- | ||
|
|
||
| # Screen Recording | ||
|
|
||
| You have access to a screen recording capability that captures what happens on the user's Mac during computer-use sessions. | ||
|
|
||
| ## How It Works | ||
|
|
||
| - **Automatic in QA mode**: When a QA/test session starts, recording begins automatically before any destructive actions (clicks, typing, etc.) | ||
| - **Can be requested explicitly**: Sessions can be configured with `requiresRecording: true` to enable recording outside of QA mode | ||
|
Review comment: This instruction says […] |
||
| - **Recording gate**: When recording is required, destructive actions are blocked until the first video frame is confirmed captured | ||
|
|
||
| ## Recording Details | ||
|
|
||
| - **Format**: H.264 MP4 video at 30 fps | ||
| - **Resolution**: 1920×1080 | ||
| - **Bitrate**: 4 Mbps video, 128 kbps AAC audio (when audio is enabled) | ||
| - **Capture scope**: Either the full display or a specific window | ||
| - **Storage**: Files are saved to `~/Library/Application Support/vellum-assistant/recordings/` | ||
| - **Naming**: `qa-recording-{timestamp}.mp4` | ||
|
|
||
| ## Health Checks | ||
|
|
||
| A first-frame handshake verifies the capture pipeline is healthy within 5 seconds of starting. If no frames arrive: | ||
| - **Required recording**: The session fails immediately with a clear error | ||
| - **Optional recording**: A warning is shown but the session continues | ||
|
|
||
| ## After Recording | ||
|
|
||
| When a recording completes: | ||
| 1. The video file is saved to disk | ||
| 2. A file-backed attachment is created (metadata in DB, file stays on disk) | ||
| 3. The attachment is linked to the originating chat message | ||
| 4. The video appears inline in the conversation for playback | ||
|
|
||
| ## Retention | ||
|
|
||
| Recordings have an expiration timestamp (default: 7 days, configurable). An automatic cleanup worker runs periodically (every 6 hours) and deletes expired recording files from disk. | ||
|
|
||
| ## Limitations | ||
|
|
||
| - Requires Screen Recording permission in System Settings > Privacy & Security | ||
| - Only available on macOS | ||
| - One recording at a time per session | ||
| - Recording must be stopped explicitly (or stops when the session ends) | ||
| - Large recordings consume disk space until cleanup runs | ||
|
|
||
| ## When to Mention Recording | ||
|
|
||
| - Tell users their QA session is being recorded when relevant | ||
| - Offer to analyze recordings using the media-processing skill (keyframe extraction, event detection) | ||
| - Mention retention period if users ask about storage or cleanup | ||
| - If recording fails, explain the specific error (permission denied, no display found, etc.) | ||
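
As an illustration of the Health Checks and recording-gate rules in the skill text above, the decision logic might look like the following sketch. The names `HandshakeOutcome` and `evaluateFirstFrame` are hypothetical and do not come from this PR; only the 5-second window and the required/optional split are taken from the SKILL.md.

```typescript
// Hypothetical sketch of the first-frame handshake decision described in SKILL.md.
// None of these names come from the PR; only the 5 s window and the
// required/optional behaviour are taken from the skill text.
const FIRST_FRAME_TIMEOUT_MS = 5000;

type HandshakeOutcome =
  | { kind: "healthy"; allowDestructiveActions: true }
  | { kind: "warn"; allowDestructiveActions: true; message: string }
  | { kind: "fail"; allowDestructiveActions: false; message: string };

// firstFrameAtMs: milliseconds after start when the first frame arrived,
// or null if no frame arrived at all.
function evaluateFirstFrame(
  firstFrameAtMs: number | null,
  recordingRequired: boolean,
): HandshakeOutcome {
  const arrivedInTime =
    firstFrameAtMs !== null && firstFrameAtMs <= FIRST_FRAME_TIMEOUT_MS;
  if (arrivedInTime) {
    return { kind: "healthy", allowDestructiveActions: true };
  }
  const message = "No video frame captured within 5 seconds of starting";
  // Required recording: fail the session and keep destructive actions blocked.
  // Optional recording: surface a warning and let the session continue.
  return recordingRequired
    ? { kind: "fail", allowDestructiveActions: false, message }
    : { kind: "warn", allowDestructiveActions: true, message };
}
```

Keeping the decision a pure function of the frame arrival time and the `requiresRecording` flag would make the gate easy to unit-test without a real capture pipeline.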
🔴 Multi-line YAML metadata not parsed by line-based frontmatter parser, losing os constraint and emoji

The `metadata` field in the new SKILL.md uses multi-line nested YAML syntax, but the frontmatter parser (`assistant/src/skills/frontmatter.ts:28-62`) is a simple line-by-line key-value parser that splits on `:`. It does not support multi-line YAML or nested structures.

Root Cause and Impact

When the parser encounters the multi-line `metadata:` block in the frontmatter, it parses each line independently, producing:

- `metadata` → `""` (empty string)
- `vellum` → `""` (empty string)
- `emoji` → `"🎥"`
- `os` → `["darwin"]`

Since `metadata` is an empty string, the check at `assistant/src/config/skills.ts:309` (`if (metadataRaw)`) evaluates to `false` and skips JSON parsing entirely. The `metadata` field on the skill will be `undefined`.

This causes two problems:

1. The `os: ["darwin"]` constraint is lost: the skill will appear available on all platforms instead of only macOS. The OS eligibility check at `assistant/src/config/skills.ts:149-158` will never fire because `vellum.os` is undefined.
2. The emoji `🎥` is lost: the skill won't display its intended icon.

Every other skill in the codebase uses single-line JSON for metadata (e.g., `metadata: {"vellum": {"emoji": "…", "os": ["darwin"]}}` in `assistant/src/config/bundled-skills/macos-automation/SKILL.md`).
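
To make the failure mode concrete, here is a minimal sketch of a line-by-line key-value parser of the kind this review describes. It is not the code in `assistant/src/skills/frontmatter.ts`: `parseFrontmatterLines` and `readMetadata` are hypothetical names, and the quote stripping and the JSON step are assumptions made for illustration.

```typescript
// Hypothetical sketch: NOT the actual parser in assistant/src/skills/frontmatter.ts.
// It mimics a line-based parser that splits each line on the first ":".
function parseFrontmatterLines(frontmatter: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const line of frontmatter.split("\n")) {
    const sep = line.indexOf(":");
    if (sep === -1) continue; // not a key-value line
    const key = line.slice(0, sep).trim(); // indentation (nesting) is discarded
    let value = line.slice(sep + 1).trim();
    // Assumed behaviour: strip one pair of surrounding double quotes.
    if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
      value = value.slice(1, -1);
    }
    fields[key] = value;
  }
  return fields;
}

// Assumed downstream step: metadata is JSON-parsed only when the raw string is non-empty.
function readMetadata(fields: Record<string, string>): unknown {
  const metadataRaw = fields["metadata"];
  return metadataRaw ? JSON.parse(metadataRaw) : undefined;
}

// Multi-line nested YAML, as written in the new SKILL.md.
const multiLine = [
  "metadata:",
  "  vellum:",
  '    emoji: "🎥"',
  '    os: ["darwin"]',
].join("\n");

// Single-line JSON, the form the review says other skills use.
const singleLine = 'metadata: {"vellum": {"emoji": "🎥", "os": ["darwin"]}}';

const multiLineMetadata = readMetadata(parseFrontmatterLines(multiLine));
const singleLineMetadata = readMetadata(parseFrontmatterLines(singleLine));

console.log(multiLineMetadata); // undefined, because "metadata" parsed as an empty string
console.log(singleLineMetadata); // nested object that still carries vellum.os
```

Under these assumptions the nested keys are flattened into unrelated top-level entries, so only the single-line JSON form keeps `vellum.os` attached to `metadata`.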