Add Test Coverage for Multiplexed Session Goroutine Leak#13529
MASA-JAPAN wants to merge 1 commit into `googleapis:main`
Conversation
Summary of Changes

Hello @MASA-JAPAN, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces regression tests for a previously fixed goroutine leak within the Spanner client's multiplexed session management. The tests validate that the `multiplexedSessionReq` channel is properly closed when the session pool shuts down.
Code Review
This pull request adds valuable regression tests for a previously fixed goroutine leak. The tests correctly focus on verifying channel closure, which is a robust way to confirm the fix. The test descriptions and PR summary are exceptionally clear and well documented. My review includes a couple of suggestions to improve the new tests: one to remove potentially unnecessary `time.Sleep` calls for a more robust and faster test, and another to refactor the second test to remove confusing and ineffective `defer`/`recover` logic. Overall, this is a great contribution that improves the project's test coverage.
```go
time.Sleep(50 * time.Millisecond)

// Close the pool
client.Close()

// Give some time for cleanup
time.Sleep(100 * time.Millisecond)
```
These `time.Sleep` calls can likely be removed to make the test faster and more robust.

- The sleep on line 2263 (`time.Sleep(50 * time.Millisecond)`) seems unnecessary. The preceding `select` block already confirms that a message can be sent to the channel. The test doesn't depend on the request being fully processed, only that the channel was open, so waiting here isn't required.
- The sleep on line 2269 (`time.Sleep(100 * time.Millisecond)`) is redundant. The final `select` block (lines 2273-2281) already waits for up to 100ms for the channel to be closed. By having both a sleep and a timeout, the test waits longer than necessary. You can rely solely on the `select` with its timeout to handle any small delay in closing the channel after `client.Close()` returns.

Removing these sleeps would make the test more efficient without compromising its correctness.
```go
func() {
	defer func() {
		if r := recover(); r != nil {
			t.Fatal("Sending to multiplexedSessionReq panicked, which means channel might not be closed properly or send was attempted incorrectly")
		}
	}()

	// Try to receive from the closed channel
	// A receive from a closed channel returns immediately with zero value
	select {
	case _, ok := <-pool.multiplexedSessionReq:
		if ok {
			t.Fatal("multiplexedSessionReq channel is still open after pool close")
		}
		// Channel is properly closed
	case <-time.After(100 * time.Millisecond):
		t.Fatal("multiplexedSessionReq channel is not closed after pool close")
	}
}()
```
This anonymous function with `defer`/`recover` is unnecessary, and its logic doesn't align with the implementation.

- The `recover` is set up to handle a panic, which typically occurs when sending to a closed channel.
- However, the code inside the `select` block receives from the channel. Receiving from a channel (whether open or closed) never panics, so the `recover` block is unreachable.
- The `select` block on its own is a correct and sufficient way to verify whether the channel is closed. The surrounding `func()` wrapper and `defer`/`recover` logic can be removed for clarity and correctness. This would also make the comments on lines 2318-2319, which mention panics, obsolete.

I suggest replacing this entire block with just the core `select` logic.
```go
// Verify the channel is closed by attempting to receive.
// A receive from a closed channel returns immediately with a zero value and ok=false.
select {
case _, ok := <-pool.multiplexedSessionReq:
	if ok {
		t.Fatal("multiplexedSessionReq channel is still open after pool close")
	}
	// Channel is properly closed.
case <-time.After(100 * time.Millisecond):
	t.Fatal("multiplexedSessionReq channel is not closed after pool close")
}
```
Summary
Add regression tests for issue #13396 to verify that the `multiplexedSessionReq` channel is properly closed when the session pool closes, preventing the `createMultiplexedSession` goroutine from leaking.

Background
The Bug (Issue #13396)
When a Spanner client with multiplexed sessions enabled is closed, a background goroutine could leak if the channel it was waiting on was not properly closed.
Root Cause:

- The `createMultiplexedSession()` goroutine runs in a loop waiting for requests on the `multiplexedSessionReq` channel
- When `sessionPool.close()` was called, this channel was not being closed

The Fix:

A single line was added to `sessionPool.close()`; this allows the `for range` loop to exit and the goroutine to terminate cleanly.

Why Tests Are Needed
While the fix was implemented in commit 1805e89, no tests were added at that time. Without test coverage, this bug could silently reappear.
Changes
Files Modified
- `spanner/session_test.go`: added two new test functions

Tests Added
1. `TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak`

Purpose: Verifies that the multiplexed session goroutine is properly cleaned up when the session pool closes.

How it works:

- Sends a request to the `multiplexedSessionReq` channel to verify it's open and working
- After closing the pool, if a receive from the channel returns `ok == false`, the test passes

What it catches:

- The `createMultiplexedSession` goroutine blocking forever

2. `TestSessionPool_MultiplexedSessionReqChannelClosed`

Purpose: A complementary test that focuses specifically on channel closure mechanics.
How it works:
Why have both tests:
Testing
Running the Tests
Test Results
Design Decisions
Why Test Channel Closure Instead of Goroutine Count?
Initially, I attempted to verify the fix by counting goroutines before and after closing the pool:
Problem: This approach was unreliable in test environments due to:

- Other tests running concurrently (`t.Parallel()`)

Solution: Test the direct cause (channel closure) rather than the symptom (goroutine count):

- When a channel is closed, `for range` loops over it exit immediately

Why Use `select` with Timeout?

This pattern provides:
Impact
What This Prevents
Without these tests:
With these tests:
Regression Prevention
These tests serve as executable documentation that:
Related Issues
Checklist
- New tests added in `session_test.go`
- Tests cover the fix (the `close()` call)
This is my first contribution to this repository. I chose to add test coverage for a recent bug fix as a way to:
I'm open to any feedback or suggestions for improvement!
Testing Strategy Reference:

- `t.Parallel()` for concurrent test execution
- `setupMockedTestServerWithConfig()` helper for consistent test setup