feat: add startup cleanup and maintenance system #16628

Open
jmylchreest wants to merge 1 commit into anomalyco:dev from jmylchreest:feat/cleanup-maintenance


jmylchreest commented Mar 8, 2026

Issue for this PR

Closes #14731

Related: #16101, #12960

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Three things:

1. Fixes the log cleanup bug (#14731)

Glob.scan returns files in non-deterministic order (it depends on the filesystem and path-scurry internals), so the existing files.slice(0, -10) could delete the newest files instead of the oldest. Fixed by adding files.sort() (ISO-timestamp filenames sort chronologically) and aligning the guard with the slice count (<= 5 becomes <= maxCount). maxCount is now configurable via config.
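As a minimal sketch of the fix described above (not the PR's actual code), the selection logic can be written as a pure function. The function name `selectLogsToDelete` and the example filenames are hypothetical; the only assumption carried over from the description is that log filenames begin with an ISO-style timestamp, so a plain lexicographic sort is also chronological.

```typescript
// Hypothetical helper: returns the log files that should be deleted so that
// only the newest `maxCount` remain. Assumes timestamp-prefixed filenames.
function selectLogsToDelete(files: string[], maxCount: number): string[] {
  // Guard and slice use the same threshold, so they cannot disagree.
  if (files.length <= maxCount) return [];

  // Sort a copy: the scan order is not guaranteed, and the caller's
  // array should not be mutated.
  const sorted = [...files].sort();

  // Everything except the last `maxCount` entries is the oldest set.
  // Using an explicit end index avoids the slice(0, -0) edge case.
  return sorted.slice(0, sorted.length - maxCount);
}

// Usage: input arrives in arbitrary order, the oldest file is selected.
const doomed = selectLogsToDelete(
  ["2026-03-08T120000.log", "2026-03-06T120000.log", "2026-03-07T120000.log"],
  2,
);
console.log(doomed); // ["2026-03-06T120000.log"]
```

Keeping the selection separate from the actual deletion makes the ordering behaviour easy to unit test without touching the filesystem.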

2. Adds a startup cleanup module

Runs in the background (deferred 500ms, yields every 100 deletions) to avoid blocking the TUI. It does:

  • Session retention: optionally delete sessions older than N days (opt-in via max_age_days)
  • Orphan sweep: removes storage files (session, session_diff, message, part, todo, project) that have no matching DB record — these are legacy pre-SQLite migration files that were never cleaned up
  • Snapshot sweep: removes snapshot directories for projects that no longer exist in the DB
  • Empty directory pruning: recursively removes empty dirs after sweep (they auto-recreate on next write)
  • SQLite VACUUM + WAL checkpoint: reclaims space in the DB file
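The deferred, yielding sweep described above could look roughly like the following. This is a sketch under stated assumptions, not the module from this PR: `sweepOrphans`, `scheduleCleanup`, `StoredFile`, and the injected `remove` callback are all hypothetical names, and the set of known IDs stands in for whatever lookup the real code performs against the DB.

```typescript
// Hypothetical shape for a file found under the storage directory.
interface StoredFile {
  id: string;   // record ID derived from the file name
  path: string; // location on disk
}

// Give pending timers and I/O callbacks a chance to run.
const yieldToEventLoop = (): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, 0));

// Deletes every file whose ID has no matching DB record, pausing after
// each `batchSize` deletions so a long sweep does not starve the UI.
async function sweepOrphans(
  files: StoredFile[],
  knownIds: Set<string>,
  remove: (path: string) => Promise<void>,
  batchSize = 100,
): Promise<number> {
  let deleted = 0;
  for (const file of files) {
    if (knownIds.has(file.id)) continue; // still referenced, keep it
    await remove(file.path);
    deleted++;
    if (deleted % batchSize === 0) await yieldToEventLoop();
  }
  return deleted;
}

// Starts the task after a delay so startup work gets to run first.
function scheduleCleanup(task: () => Promise<unknown>, delayMs = 500): void {
  setTimeout(() => {
    task().catch((err) => console.error("cleanup failed", err));
  }, delayMs);
}
```

Passing `remove` in as a parameter keeps the sweep testable with an in-memory stub; in real use it would wrap a filesystem delete.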

All configurable via a cleanup config stanza with sensible defaults (vacuum enabled, session retention disabled/opt-in, all storage categories swept).
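To illustrate the defaults listed above, here is one possible shape for such a stanza. Only `max_age_days` is named in this description; the `vacuum` and `sweep` field names, the `CleanupConfig` type, and both helper functions are assumptions made for the example and may not match the schema in the PR.

```typescript
// Hypothetical config shape; field names other than max_age_days are guesses.
interface CleanupConfig {
  vacuum: boolean;       // run VACUUM + WAL checkpoint on startup
  max_age_days?: number; // undefined means session retention is disabled
  sweep: string[];       // storage categories to check for orphans
}

// Defaults as described: vacuum on, retention opt-in, all categories swept.
const CLEANUP_DEFAULTS: CleanupConfig = {
  vacuum: true,
  max_age_days: undefined,
  sweep: ["session", "session_diff", "message", "part", "todo", "project"],
};

// Overlay user-supplied values on top of the defaults.
function resolveCleanupConfig(user: Partial<CleanupConfig> = {}): CleanupConfig {
  return { ...CLEANUP_DEFAULTS, ...user };
}

// Returns the timestamp (ms) before which sessions count as expired,
// or undefined when retention has not been opted into.
function sessionCutoff(config: CleanupConfig, nowMs: number): number | undefined {
  if (config.max_age_days === undefined) return undefined;
  const msPerDay = 24 * 60 * 60 * 1000;
  return nowMs - config.max_age_days * msPerDay;
}
```

With no user config, `sessionCutoff` returns `undefined`, so no sessions are deleted unless a retention period is set explicitly.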

3. Fixes a cross-platform bug in Storage.list()

path.sep was used to split glob results, but glob always returns forward slashes regardless of OS. Changed to split("/").
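The difference is easy to see in isolation. The sketch below is illustrative only: `toKeySegments` is a hypothetical helper, the sample path is made up, and the premise that glob results use forward slashes on every OS is taken from the description above rather than verified here.

```typescript
// Hypothetical helper: turns a glob result into key segments.
// Splits on "/" because that is the separator glob results use,
// independent of the host platform's path separator.
function toKeySegments(globResult: string): string[] {
  return globResult.split("/").filter((segment) => segment.length > 0);
}

const globResult = "session/abc/msg_1.json";

// On Windows, path.sep is a backslash. Splitting a forward-slash path on it
// finds no separator and returns the whole string as a single segment.
const windowsSep = "\\";
console.log(globResult.split(windowsSep).length); // 1

// Splitting on "/" yields the expected three segments on any platform.
console.log(toKeySegments(globResult)); // ["session", "abc", "msg_1.json"]
```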

In testing on a real installation, this reclaimed ~2.8GB (115k orphaned files → 3 files).

How did you verify your code works?

  • Ran bun run dev -- --print-logs --log-level DEBUG and verified cleanup logs appear with correct counts
  • Confirmed orphan sweep correctly removes files with no matching DB records
  • Confirmed empty directory pruning works recursively
  • Confirmed vacuum completes in <10ms
  • Confirmed TUI is not blocked (cleanup deferred + yielding)
  • CI: typecheck passes, unit tests pass (Linux), nix-eval passes

Screenshots / recordings

INFO  2026-03-08T19:44:39 +450ms service=cleanup cleanup started
INFO  2026-03-08T19:44:39 +11ms  service=cleanup duration=3 vacuum complete
INFO  2026-03-08T19:44:39 +0ms   service=cleanup sessions_deleted=0 orphans_swept=0 cleanup complete

Before/after on a real install:

Before: 1.7GB storage (115k files), 1.1GB snapshots (16 dirs)
After:  12K storage (3 files), 0 snapshots

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Add configurable cleanup module that runs on startup to manage database
size and remove orphaned storage files. Fixes log cleanup bug where
non-deterministic glob ordering could delete newest files instead of
oldest.

Changes:
- New cleanup config stanza with session retention, storage sweep, and vacuum options
- Fix log cleanup: sort by filename before slicing, align guard with slice count
- Sweep orphaned storage files (session, message, part, todo, session_diff, project, snapshot)
- Prune empty directories after sweep
- Run SQLite VACUUM and WAL checkpoint on startup
- Fix pre-existing path.sep bug in Storage.list() for cross-platform correctness
- Defer cleanup 500ms and yield every 100 deletions to avoid blocking TUI

github-actions bot added the needs:compliance label Mar 8, 2026

github-actions bot commented Mar 8, 2026

The following comment was made by an LLM, it may be inaccurate:

Based on my search, I found several related PRs that address similar concerns:

Potential duplicates/related PRs:

  1. PR #16201 - feat(session): add lifecycle management — storage reclamation, CLI commands, VACUUM support

    • Directly related: covers session lifecycle management, storage reclamation, and VACUUM support—core features of the current PR
  2. PR #16346 - fix(opencode): unbounded memory growth during active usage

    • Related: addresses similar storage/memory growth concerns that cleanup is meant to solve
  3. PR #12856 - feat: bugfix to snapshot pruning and allow snapshot config to accept positive integer for retention lifespan in days (resolves #10626, #10782, #6845, #3182, #10532, #8577)

    • Related: addresses snapshot/session retention configuration similar to the session cleanup aspect
  4. PR #14792 - fix(log): sort files before slicing to delete oldest logs first

    • Related: addresses the same log cleanup sorting bug mentioned in this PR's description
  5. PR #7245 - fix(log): await cleanup and sort files before deletion

    • Related: earlier attempt at log cleanup and sorting

Note: PR #16201 appears to be the most closely related—it covers VACUUM support and storage reclamation which are core features of PR #16628. You may want to verify whether this PR supersedes, complements, or duplicates that work.

jmylchreest (Author) commented

Related Issues & PRs

Disclaimer: These were found via search by an AI assistant and may not be exhaustive or perfectly categorized.

Directly addressed by this PR

| # | Title | Relationship |
| --- | --- | --- |
| #14731 | Log cleanup deletes newest log files instead of oldest | Fixed: files.sort() + guard/slice alignment |
| #12960 | feat: support configurable log rotation strategies | Partially addressed: configurable max_count, though not full rotation strategies |
| #16101 | Session Lifecycle Management — unified storage reclamation | Partially addressed: session retention by age, orphan sweep, vacuum |

Related PRs (overlapping scope)

| # | Title | Overlap |
| --- | --- | --- |
| #14792 | fix(log): sort files before slicing to delete oldest logs first | Same log cleanup fix; this PR is more comprehensive |
| #7245 | fix(log): await cleanup and sort files before deletion | Same log cleanup fix |
| #6641 | fix: prevent log file deletion when using --log-level DEBUG | Related log cleanup issue |
| #16201 | feat(session): add lifecycle management — storage reclamation, CLI commands, VACUUM support | Significant overlap: session cleanup + vacuum |
| #12631 | feat(cli): add session prune command for cleanup | Session pruning (CLI-based vs our startup-based) |
| #12966 | feat(opencode): add pluggable log rotation strategies | Log rotation |

Related issues (symptoms this PR helps with)

| # | Title | How this PR helps |
| --- | --- | --- |
| #13479 | Large number of tmp files taking up disk space | Orphan sweep + empty dir pruning |
| #8096 | C drive crashed after 10 minutes | Storage/disk reclamation on startup |
| #12687 | Severe Memory Leak and Disk Swell leading to Kernel Panic | Storage reclamation helps with disk component |
| #7607 | If disk space runs out your active session becomes corrupted | Proactive cleanup reduces disk pressure |
| #10034 | tmp_pack Leak - Complete Code Analysis | Snapshot orphan cleanup |
| #14648 | Worktree bootstrap failures leak orphaned directories | Related disk cleanup concern |
| #7733 | Storage resilience: atomic writes, safer temp cleanup | Related storage cleanup |

github-actions bot removed the needs:compliance label Mar 8, 2026

github-actions bot commented Mar 8, 2026

Thanks for updating your PR! It now meets our contributing guidelines. 👍

