feat/fix Re-enabled WAL with commit transaction management (Linux Verification Requested) #5793
Conversation
Signed-off-by: Vincent Huang <[email protected]>
Pull Request Overview
This PR re-enables SQLite WAL (Write-Ahead Logging) mode with explicit transaction management to fix a race condition where get_session could read stale data immediately after create_session. The solution adds explicit tx.commit() calls to ensure WAL changes are visible to concurrent readers.
Key changes:
- Enabled WAL journal mode in SQLite connection options
- Added explicit transaction management with commit calls to write operations
- Ensures WAL checkpoints occur at transaction boundaries for consistent reads
    tx.commit().await?;

    if let Some(conversation) = &session.conversation {
        self.replace_conversation(&session.id, conversation).await?;
    }
Copilot
AI
Nov 18, 2025
The transaction commits before calling replace_conversation, breaking atomicity of the import. If replace_conversation fails, the session will be imported without messages. Consider including the conversation import in the same transaction, or document why this split is intentional.
    - tx.commit().await?;
    - if let Some(conversation) = &session.conversation {
    -     self.replace_conversation(&session.id, conversation).await?;
    - }
    + if let Some(conversation) = &session.conversation {
    +     self.replace_conversation_tx(&session.id, conversation, &mut tx).await?;
    + }
    + tx.commit().await?;
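To illustrate the atomicity concern raised in this review comment, here is a minimal sketch using a toy in-memory store in std-only Rust. It is not goose or sqlx code: `ToyDb`, `ToyTx`, `import_split` and `import_atomic` are hypothetical names, and the "failure" is simulated by passing an empty conversation. It only shows why committing before the second step can leave a session without messages, while a single transaction keeps the import all-or-nothing.

```rust
// Toy in-memory model; every name here is hypothetical (not goose or sqlx code).
use std::collections::HashMap;

#[derive(Default, Debug, Clone, PartialEq)]
struct ToyDb {
    sessions: Vec<String>,
    messages: HashMap<String, Vec<String>>,
}

/// Staged changes that reach the database only when `commit` is called.
struct ToyTx<'a> {
    db: &'a mut ToyDb,
    staged: ToyDb,
}

impl<'a> ToyTx<'a> {
    fn begin(db: &'a mut ToyDb) -> Self {
        let staged = db.clone();
        ToyTx { db, staged }
    }

    fn insert_session(&mut self, id: &str) {
        self.staged.sessions.push(id.to_string());
    }

    /// Fails on an empty conversation to simulate an error during the import.
    fn replace_conversation(&mut self, id: &str, conversation: &[&str]) -> Result<(), String> {
        if conversation.is_empty() {
            return Err("conversation import failed".to_string());
        }
        let msgs: Vec<String> = conversation.iter().map(|m| m.to_string()).collect();
        self.staged.messages.insert(id.to_string(), msgs);
        Ok(())
    }

    /// Publish the staged state; dropping the transaction instead discards it.
    fn commit(self) {
        *self.db = self.staged;
    }
}

/// Split import: the session is committed before the conversation is written.
fn import_split(db: &mut ToyDb, id: &str, conversation: &[&str]) -> Result<(), String> {
    let mut tx = ToyTx::begin(db);
    tx.insert_session(id);
    tx.commit();

    let mut tx2 = ToyTx::begin(db);
    tx2.replace_conversation(id, conversation)?;
    tx2.commit();
    Ok(())
}

/// Single-transaction import: both steps commit together or not at all.
fn import_atomic(db: &mut ToyDb, id: &str, conversation: &[&str]) -> Result<(), String> {
    let mut tx = ToyTx::begin(db);
    tx.insert_session(id);
    tx.replace_conversation(id, conversation)?; // an early return drops `tx` uncommitted
    tx.commit();
    Ok(())
}
```

With a failing conversation step, `import_split` leaves the session in the store with no messages, whereas `import_atomic` leaves the store unchanged.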
I was on the fence about changing this, since replace_conversation already does its own commit independent of import_session. The easiest solution would be to unwrap (inline) replace_conversation inside import_session.
If we move towards foreign keys, then messages should not be orphaned from their session by default, and this can be addressed there.
jamadeo
left a comment
Summary
There seems to be a WAL race condition in the builder.rs file: a session is created and get_session is called immediately afterwards, which in some instances can fail because the get_session call is looking at an old version of the sessions db + WAL from one thread, before the create_session changes have finished propagating.

I believe the culprit is the lack of an explicit commit of the transaction with WAL enabled. The concurrency section of the SQLite documentation says: "When a read operation begins on a WAL-mode database, it first remembers the location of the last valid commit record in the WAL." So even though we were relying on concurrency through await?;, create_session never explicitly called commit, which possibly resulted in get_session "misses" on an old version of the database.

This could also explain why the PRAGMA wal_checkpoint approach didn't work: the checkpoint had no completed commit whose WAL file changes it could apply to the database.
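The visibility rule quoted from the SQLite documentation can be pictured with a toy model in std-only Rust. This is only a sketch of the idea, not SQLite's implementation and not the code in this PR: `ToyWal`, `Frame` and all method names are hypothetical. A reader records the position of the last commit record when it starts and sees nothing written after it.

```rust
// Toy model only: this is NOT SQLite's implementation. All names are hypothetical.

/// One entry in the toy write-ahead log.
#[derive(Debug, Clone, PartialEq)]
enum Frame {
    /// A pending write of a session id.
    Write(String),
    /// Marks every frame before it as committed.
    Commit,
}

/// Append-only log shared by a writer and any number of readers.
#[derive(Default)]
struct ToyWal {
    frames: Vec<Frame>,
}

impl ToyWal {
    fn append_write(&mut self, session_id: &str) {
        self.frames.push(Frame::Write(session_id.to_string()));
    }

    fn append_commit(&mut self) {
        self.frames.push(Frame::Commit);
    }

    /// Start a read: remember the position just past the last commit record.
    fn begin_read(&self) -> usize {
        self.frames
            .iter()
            .rposition(|f| *f == Frame::Commit)
            .map_or(0, |i| i + 1)
    }

    /// A reader only sees writes that precede its remembered commit record.
    fn visible_sessions(&self, snapshot_end: usize) -> Vec<String> {
        self.frames[..snapshot_end]
            .iter()
            .filter_map(|f| match f {
                Frame::Write(id) => Some(id.clone()),
                Frame::Commit => None,
            })
            .collect()
    }
}
```

In this model, a reader that calls `begin_read` after `append_write("session-1")` but before `append_commit()` gets an empty result from `visible_sessions`, which mirrors the suspected get_session "miss". A reader that starts after the commit record sees the session.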
Type of Change
AI Assistance
Testing
I was unable to reproduce the failure on multiple Linux Docker images, so I reproduced the "bug" (the create_session and get_session race condition) by writing concurrent create_session -> get_session race condition tests and running them a few thousand times.

I also ran a timing check on an existing concurrency test:

    time cargo test test_concurrent_session_creation --release -- --test-threads=1;

Eyeballing the results, the test takes about 0.763s with WAL and 0.848s without, about a 10% improvement.

Related Issues
Relates to #5197
Discussion: