Add Test Coverage for Multiplexed Session Goroutine Leak#13529

Open
MASA-JAPAN wants to merge 1 commit into googleapis:main from MASA-JAPAN:test/spanner-multiplexed-session-leak

Conversation

@MASA-JAPAN
Contributor

Summary

Add regression tests for issue #13396 to verify that the multiplexedSessionReq channel is properly closed when the session pool closes, preventing the createMultiplexedSession goroutine from leaking.

Background

The Bug (Issue #13396)

When a Spanner client with multiplexed sessions enabled is closed, a background goroutine could leak if the channel it was waiting on was not properly closed.

Root Cause:

  • The createMultiplexedSession() goroutine runs in a loop waiting for requests on the multiplexedSessionReq channel
  • When sessionPool.close() was called, this channel was not being closed
  • The goroutine remained blocked forever, waiting for messages that would never come
  • This resulted in a goroutine leak

The Fix:
A single line was added to sessionPool.close():

close(p.multiplexedSessionReq)

This allows the for range loop to exit and the goroutine to terminate cleanly.

Why Tests Are Needed

While the fix was implemented in commit 1805e89, no tests were added at that time. Without test coverage, this bug could silently reappear if:

  • Code is refactored
  • The cleanup logic is modified
  • New features are added that change the session pool lifecycle

Changes

Files Modified

  • spanner/session_test.go - Added two new test functions

Tests Added

1. TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak

Purpose: Verifies that the multiplexed session goroutine is properly cleaned up when the session pool closes.

How it works:

  1. Creates a client with multiplexed sessions enabled (starts the background goroutine)
  2. Sends a request to the multiplexedSessionReq channel to verify it's open and working
  3. Closes the client/pool
  4. Verifies that the channel is properly closed by attempting to receive from it
  5. If the channel is closed (ok == false), the test passes
  6. If the channel is still open or times out, the test fails (indicating a leak)

What it catches:

  • Goroutine leaks caused by the channel not being closed
  • The createMultiplexedSession goroutine blocking forever
  • Memory leaks from leaked goroutines in production applications

2. TestSessionPool_MultiplexedSessionReqChannelClosed

Purpose: A complementary test that focuses specifically on channel closure mechanics.

How it works:

  1. Creates a client with multiplexed sessions enabled
  2. Verifies the channel is open by sending a request
  3. Closes the pool
  4. Verifies the channel is closed using receive semantics

Why have both tests:

The first test exercises the full lifecycle (goroutine startup through pool shutdown), while the second isolates the channel-closure mechanics, so a future regression fails with a pinpointed cause.

Testing

Running the Tests

# Run just the new tests
go test -race -v -timeout 15m -short -run "TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak|TestSessionPool_MultiplexedSessionReqChannelClosed"

# Run the full test suite
go test -race -v -timeout 15m -short ./...

Test Results

=== RUN   TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak
--- PASS: TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak (0.17s)
=== RUN   TestSessionPool_MultiplexedSessionReqChannelClosed
--- PASS: TestSessionPool_MultiplexedSessionReqChannelClosed (0.07s)
PASS
ok      cloud.google.com/go/spanner    1.934s

Design Decisions

Why Test Channel Closure Instead of Goroutine Count?

Initially, I attempted to verify the fix by counting goroutines before and after closing the pool:

initialGoroutines := runtime.NumGoroutine()
// ... create and close client ...
finalGoroutines := runtime.NumGoroutine()
// Check if goroutines returned to baseline

Problem: This approach was unreliable in test environments due to:

  • Background goroutines from the test framework
  • Garbage collector goroutines
  • Other tests running in parallel (t.Parallel())
  • Non-deterministic goroutine lifecycle timing

Solution: Test the direct cause (channel closure) rather than the symptom (goroutine count):

  • When a channel is closed, for range loops over it exit immediately
  • Testing channel closure is deterministic and reliable
  • The test directly validates the fix that was implemented

Why Use select with Timeout?

select {
case _, ok := <-pool.multiplexedSessionReq:
    if ok {
        t.Fatal("channel is still open")
    }
case <-time.After(100 * time.Millisecond):
    t.Fatal("channel is not closed")
}

This pattern provides:

  • Immediate feedback if the channel is closed (no blocking)
  • Timeout protection if the channel is still open but empty (detects the bug)
  • Clear failure messages for debugging

Impact

What This Prevents

Without these tests:

// Production code with the bug:
func handleRequest() {
    client := createSpannerClient()  // Starts goroutine
    defer client.Close()              // Goroutine leaks!
    // ... use client ...
}

// After 1,000 requests:
// - 1,000 leaked goroutines
// - Increasing memory usage
// - Eventually: Out of memory crash

With these tests:

  • Any code change that breaks channel closure will fail CI
  • Developers are immediately alerted to goroutine leaks
  • Production applications stay healthy

Regression Prevention

These tests serve as executable documentation that:

  1. Explain what the bug was
  2. Demonstrate how the fix works
  3. Prevent the bug from being reintroduced
  4. Help future contributors understand multiplexed session lifecycle

Related Issues

  • #13396 - createMultiplexedSession goroutine leak (fixed in commit 1805e89)

Checklist

  • Tests pass locally
  • Tests follow existing patterns in session_test.go
  • Code follows conventional commit message format
  • Added comprehensive documentation in test comments
  • Verified tests fail without the fix (by temporarily reverting the close() call)
  • Verified tests pass with the fix

Additional Notes

This is my first contribution to this repository. I chose to add test coverage for a recent bug fix as a way to:

  1. Learn the codebase
  2. Provide value through regression prevention
  3. Practice Go testing best practices

I'm open to any feedback or suggestions for improvement!


Testing Strategy Reference:

  • Uses t.Parallel() for concurrent test execution
  • Uses setupMockedTestServerWithConfig() helper for consistent test setup
  • Follows existing patterns in the codebase for session pool testing
  • Includes detailed comments explaining the test logic and rationale

@MASA-JAPAN MASA-JAPAN requested review from a team January 2, 2026 03:19
@MASA-JAPAN MASA-JAPAN requested a review from a team as a code owner January 2, 2026 03:19
@gemini-code-assist
Contributor

Summary of Changes

Hello @MASA-JAPAN, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces critical regression tests to address a previously fixed goroutine leak within the Spanner client's multiplexed session management. The tests validate that the multiplexedSessionReq channel is correctly closed when the session pool is terminated, thereby ensuring that the associated background goroutine exits gracefully and prevents resource leaks. This enhancement fortifies the codebase against future regressions of this specific issue.

Highlights

  • Regression Test for Goroutine Leak: Adds comprehensive regression tests for issue #13396 (fix(spanner): fix createMultiplexedSession goroutine leak), which involved a goroutine leak in the Spanner client's multiplexed session pool. These tests ensure the createMultiplexedSession goroutine terminates cleanly when the session pool is closed.
  • Channel Closure Verification: The new tests specifically verify that the multiplexedSessionReq channel is properly closed when the session pool is shut down. This directly addresses the root cause of the previous leak, where the goroutine remained blocked waiting on an unclosed channel.
  • Two New Test Functions: Introduces two distinct test functions: TestSessionPool_CreateMultiplexedSession_NoGoroutineLeak and TestSessionPool_MultiplexedSessionReqChannelClosed. The first provides a holistic check of goroutine cleanup, while the second offers a focused validation of channel closure mechanics.



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request adds valuable regression tests for a previously fixed goroutine leak. The tests correctly focus on verifying channel closure, which is a robust way to confirm the fix. The test descriptions and PR summary are exceptionally clear and well-documented. My review includes a couple of suggestions to improve the new tests: one to remove potentially unnecessary time.Sleep calls for a more robust and faster test, and another to refactor the second test to remove confusing and ineffective defer/recover logic. Overall, this is a great contribution that improves the project's test coverage.

Comment thread spanner/session_test.go
Comment on lines +2263 to +2269
time.Sleep(50 * time.Millisecond)

// Close the pool
client.Close()

// Give some time for cleanup
time.Sleep(100 * time.Millisecond)

Severity: medium

These time.Sleep calls can likely be removed to make the test faster and more robust.

  1. The sleep on line 2263 (time.Sleep(50 * time.Millisecond)) seems unnecessary. The preceding select block already confirms that a message can be sent to the channel. The test doesn't depend on the request being fully processed, only that the channel was open, so waiting here isn't required.
  2. The sleep on line 2269 (time.Sleep(100 * time.Millisecond)) is redundant. The final select block (lines 2273-2281) already waits for up to 100ms for the channel to be closed. By having a sleep and a timeout, the test waits longer than necessary. You can rely solely on the select with its timeout to handle any small delay in closing the channel after client.Close() returns.

Removing these sleeps would make the test more efficient without compromising its correctness.

Comment thread spanner/session_test.go
Comment on lines +2320 to +2338
func() {
	defer func() {
		if r := recover(); r != nil {
			t.Fatal("Sending to multiplexedSessionReq panicked, which means channel might not be closed properly or send was attempted incorrectly")
		}
	}()

	// Try to receive from the closed channel
	// A receive from a closed channel returns immediately with zero value
	select {
	case _, ok := <-pool.multiplexedSessionReq:
		if ok {
			t.Fatal("multiplexedSessionReq channel is still open after pool close")
		}
		// Channel is properly closed
	case <-time.After(100 * time.Millisecond):
		t.Fatal("multiplexedSessionReq channel is not closed after pool close")
	}
}()

Severity: medium

This anonymous function with defer/recover is unnecessary and its logic doesn't align with the implementation.

  • The recover is set up to handle a panic, which typically occurs when sending to a closed channel.
  • However, the code inside the select block receives from the channel. Receiving from a channel (whether open or closed) never panics, so the recover block is unreachable.
  • The select block on its own is a correct and sufficient way to verify if the channel is closed. The surrounding func() wrapper and defer/recover logic can be removed for clarity and correctness. This would also make the comments on lines 2318-2319, which mention panics, obsolete.

I suggest replacing this entire block with just the core select logic.

	// Verify the channel is closed by attempting to receive.
	// A receive from a closed channel returns immediately with a zero value and ok=false.
	select {
	case _, ok := <-pool.multiplexedSessionReq:
		if ok {
			t.Fatal("multiplexedSessionReq channel is still open after pool close")
		}
		// Channel is properly closed.
	case <-time.After(100 * time.Millisecond):
		t.Fatal("multiplexedSessionReq channel is not closed after pool close")
	}
