Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disconnected context in the identify protocol #1030

Closed
MichaelMure opened this issue Dec 16, 2020 · 8 comments
Closed

Disconnected context in the identify protocol #1030

MichaelMure opened this issue Dec 16, 2020 · 8 comments
Assignees
Labels
exp/novice Someone with a little familiarity can pick up kind/bug A bug in existing code (including security flaws) status/ready Ready to be worked

Comments

@MichaelMure
Copy link

Unless I'm missing something, the identify protocol should inherit from the host's context, which would then ensure that its main loop end when the host is terminated.

At the moment this main loop is left running after the host's context get cancelled.

See https://github.com/libp2p/go-libp2p/blob/master/p2p/protocol/identify/id.go#L130

@MichaelMure MichaelMure added the kind/bug A bug in existing code (including security flaws) label Dec 16, 2020
@MichaelMure
Copy link
Author

Alternatively and similar to #1031, Close() is not called properly when the host is terminated. What I found is that this goroutine is left hanging after the host's context get cancelled.

@aarshkshah1992 aarshkshah1992 self-assigned this Dec 18, 2020
@aarshkshah1992
Copy link
Contributor

aarshkshah1992 commented Dec 18, 2020

@MichaelMure We shut down the Identify service when the Host is shut down over here:

h.ids.Close()

The call to Identify.Close() will cancel the context that Identify's loop waits for:

ids.ctxCancel()

Let me know if you have concerns otherwise otherwise I can close this.

Note: We moved to this pattern as we've run into problems propagating and using contexts for closing services. You can read our discussions here:

libp2p/go-libp2p-kbucket#50 (comment)
libp2p/go-libp2p-kbucket#50 (comment)

@MichaelMure
Copy link
Author

Here is what I see:

func TestFoo(t *testing.T) {
	defer goleak.VerifyNone(t)

	ctx, cancel := context.WithCancel(context.Background())

	_, err := libp2p.New(ctx,
		libp2p.ListenAddrStrings("/ip4/0.0.0.0/tcp/1234"),
	)
	require.NoError(t, err)

	time.Sleep(10 * time.Second)

	cancel()

	time.Sleep(10 * time.Second)
}
=== RUN   TestFoo
    leaks.go:78: found unexpected goroutines:
        [Goroutine 21 in state select, with github.com/ipfs/go-log/writer.(*MirrorWriter).logRoutine on top of the stack:
        goroutine 21 [select]:
        github.com/ipfs/go-log/writer.(*MirrorWriter).logRoutine(0xc0000b5e30)
        	/home/michael/go/pkg/mod/github.com/ipfs/[email protected]/writer/writer.go:71 +0x128
        created by github.com/ipfs/go-log/writer.NewMirrorWriter
        	/home/michael/go/pkg/mod/github.com/ipfs/[email protected]/writer/writer.go:36 +0xb9
        
         Goroutine 22 in state select, with go.opencensus.io/stats/view.(*worker).start on top of the stack:
        goroutine 22 [select]:
        go.opencensus.io/stats/view.(*worker).start(0xc000146080)
        	/home/michael/go/pkg/mod/[email protected]/stats/view/worker.go:276 +0x100
        created by go.opencensus.io/stats/view.init.0
        	/home/michael/go/pkg/mod/[email protected]/stats/view/worker.go:34 +0x68
        
         Goroutine 5 in state select, with github.com/libp2p/go-libp2p-peerstore/pstoremem.(*memoryAddrBook).background on top of the stack:
        goroutine 5 [select]:
        github.com/libp2p/go-libp2p-peerstore/pstoremem.(*memoryAddrBook).background(0xc00012a000)
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/pstoremem/addr_book.go:93 +0x126
        created by github.com/libp2p/go-libp2p-peerstore/pstoremem.NewAddrBook
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/pstoremem/addr_book.go:83 +0x18e
        
         Goroutine 10 in state select, with github.com/libp2p/go-libp2p/p2p/protocol/identify.(*IDService).loop on top of the stack:
        goroutine 10 [select]:
        github.com/libp2p/go-libp2p/p2p/protocol/identify.(*IDService).loop(0xc0000b6b60)
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/p2p/protocol/identify/id.go:204 +0x4f7
        created by github.com/libp2p/go-libp2p/p2p/protocol/identify.NewIDService
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/p2p/protocol/identify/id.go:151 +0x2c3
        ]
--- FAIL: TestFoo (20.58s)
FAIL

Leaving aside opencensus and go-log, you can see that Identify and Peerstore are left around. Now admittedly, Close() could/should be used, but then why accepting a context?

Also, when actually using Close(), the Peerstore is still left around:

         Goroutine 11 in state select, with github.com/libp2p/go-libp2p-peerstore/pstoremem.(*memoryAddrBook).background on top of the stack:
        goroutine 11 [select]:
        github.com/libp2p/go-libp2p-peerstore/pstoremem.(*memoryAddrBook).background(0xc000170000)
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/pstoremem/addr_book.go:93 +0x126
        created by github.com/libp2p/go-libp2p-peerstore/pstoremem.NewAddrBook
        	/home/michael/go/pkg/mod/github.com/libp2p/[email protected]/pstoremem/addr_book.go:83 +0x18e

Just saw your edit. I understand now using the rational of using Close() instead of context cancelling, but then why accept a context in the first place as it expose to this incorrect termination? Is that something that will be removed? Maybe mention in the doc that the context is not to be used for closing?

@jacobheun
Copy link
Contributor

@aarshkshah1992 can you take a look at this again?

@aarshkshah1992
Copy link
Contributor

@jacobheun Taking a look

@aarshkshah1992
Copy link
Contributor

@MichaelMure

Unfortunately, we haven't finished migrating completely from using the context to the Close method. I've added documentation regarding using the Close method at #1037.

That PR also introduces a change to close the PeerStore.

@aschmahmann aschmahmann added status/ready Ready to be worked exp/novice Someone with a little familiarity can pick up labels Apr 12, 2021
@BigLep
Copy link
Contributor

BigLep commented Aug 20, 2021

@aarshkshah1992 : at the minimum as part of your maintenance week can you either handle this or make it clear what the next steps are please?

@marten-seemann
Copy link
Contributor

We moved away from using contexts for service shutdown, so I believe this issue has become stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exp/novice Someone with a little familiarity can pick up kind/bug A bug in existing code (including security flaws) status/ready Ready to be worked
Projects
None yet
Development

No branches or pull requests

7 participants