Stack overflow while running CAT #1208

Closed
evan-forbes opened this issue Feb 1, 2024 · 3 comments · Fixed by #1215
Assignees
Labels
cat 🐈 T:Bug Type: Bug (confirmed)

Comments

@evan-forbes
Member

evan-forbes commented Feb 1, 2024

we are recursively calling findNewPeerToRequestTx, which can cause a stack overflow when removing a peer

runtime: goroutine stack exceeds 1000000000-byte limit
runtime: sp=0xc095a82398 stack=[0xc095a82000, 0xc0b5a82000]
fatal error: stack overflow

runtime stack:
runtime.throw({0x2434860?, 0x20?})
	/usr/local/go/src/runtime/panic.go:1077 +0x5c fp=0xc000613e18 sp=0xc000613de8 pc=0x43b4fc
runtime.newstack()
	/usr/local/go/src/runtime/stack.go:1107 +0x5ac fp=0xc000613fc8 sp=0xc000613e18 pc=0x455cac
traceback: unexpected SPWRITE function runtime.morestack
runtime.morestack()
	/usr/local/go/src/runtime/asm_amd64.s:593 +0x8f fp=0xc000613fd0 sp=0xc000613fc8 pc=0x46feef

goroutine 68743 [running]:
runtime.mallocgc(0x30?, 0x232d3c0?, 0x1?)
	/usr/local/go/src/runtime/malloc.go:952 +0x825 fp=0xc095a823a8 sp=0xc095a823a0 pc=0x410725
runtime.newobject(0x411b0c?)
	/usr/local/go/src/runtime/malloc.go:1328 +0x25 fp=0xc095a823d0 sp=0xc095a823a8 pc=0x410885
runtime.makemap(0x20c87a0?, 0xc0016c23c0?, 0x416c7c?)
	/usr/local/go/src/runtime/map.go:313 +0x50 fp=0xc095a82418 sp=0xc095a823d0 pc=0x411390
github.com/tendermint/tendermint/mempool/cat.(*SeenTxSet).Get(0xc095a82540?, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/cache.go:194 +0xc5 fp=0xc095a824e0 sp=0xc095a82418 pc=0x1a30ce5
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
...
...
...
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a81660 sp=0xc0b5a81598 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a81728 sp=0xc0b5a81660 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a817f0 sp=0xc0b5a81728 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a818b8 sp=0xc0b5a817f0 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a81980 sp=0xc0b5a818b8 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a81a48 sp=0xc0b5a81980 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).findNewPeerToRequestTx(0xc00165ef50, {0x29, 0x35, 0x62, 0xe1, 0x92, 0xd1, 0x46, 0xf2, 0xc4, ...})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:483 +0x253 fp=0xc0b5a81b10 sp=0xc0b5a81a48 pc=0x1a39293
github.com/tendermint/tendermint/mempool/cat.(*Reactor).RemovePeer(0xc00165ef50, {0x2fbd960?, 0xc0011ce270?}, {0x2?, 0xc040505500?})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/mempool/cat/reactor.go:197 +0x105 fp=0xc0b5a81bc8 sp=0xc0b5a81b10 pc=0x1a36f45
github.com/tendermint/tendermint/p2p.(*Switch).stopAndRemovePeer(0xc0015a8900, {0x2fbd960, 0xc0011ce270}, {0x2199ea0, 0xc0416a91d0})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/switch.go:406 +0x179 fp=0xc0b5a81c90 sp=0xc0b5a81bc8 pc=0xa2fa99
github.com/tendermint/tendermint/p2p.(*Switch).StopPeerForError(0xc0015a8900, {0x2fbd960, 0xc0011ce270}, {0x2199ea0?, 0xc0416a91d0})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/switch.go:373 +0x12e fp=0xc0b5a81cf0 sp=0xc0b5a81c90 pc=0xa2f62e
github.com/tendermint/tendermint/p2p.(*Switch).StopPeerForError-fm({0x2fbd960?, 0xc0011ce270?}, {0x2199ea0?, 0xc0416a91d0?})
	<autogenerated>:1 +0x45 fp=0xc0b5a81d28 sp=0xc0b5a81cf0 pc=0xa3a085
github.com/tendermint/tendermint/p2p.createMConnection.func2({0x2199ea0?, 0xc0416a91d0?})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/peer.go:561 +0x39 fp=0xc0b5a81d58 sp=0xc0b5a81d28 pc=0xa2c079
github.com/tendermint/tendermint/p2p/conn.(*MConnection).stopForError(0xc050028580, {0x2199ea0, 0xc0416a91d0})
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:345 +0x102 fp=0xc0b5a81da8 sp=0xc0b5a81d58 pc=0x9f2e22
github.com/tendermint/tendermint/p2p/conn.(*MConnection).sendPacketMsg(0xc050028580)
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:548 +0x248 fp=0xc0b5a81e40 sp=0xc0b5a81da8 pc=0x9f4348
github.com/tendermint/tendermint/p2p/conn.(*MConnection).sendSomePacketMsgs(0xc050028580)
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:512 +0x55 fp=0xc0b5a81e78 sp=0xc0b5a81e40 pc=0x9f40d5
github.com/tendermint/tendermint/p2p/conn.(*MConnection).sendRoutine(0xc050028580)
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:477 +0x62c fp=0xc0b5a81fc8 sp=0xc0b5a81e78 pc=0x9f3dcc
github.com/tendermint/tendermint/p2p/conn.(*MConnection).OnStart.func1()
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:231 +0x25 fp=0xc0b5a81fe0 sp=0xc0b5a81fc8 pc=0x9f2685
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0b5a81fe8 sp=0xc0b5a81fe0 pc=0x471c01
created by github.com/tendermint/tendermint/p2p/conn.(*MConnection).OnStart in goroutine 63749
	/go/pkg/mod/github.com/celestiaorg/[email protected]/p2p/conn/connection.go:231 +0x21b

goroutine 1 [select, 9 minutes]:
runtime.gopark(0xc00076dd98?, 0x2?, 0x0?, 0x0?, 0xc00076dcc4?)
	/usr/local/go/src/runtime/proc.go:398 +0xce fp=0xc0018abb30 sp=0xc0018abb10 pc=0x43e34e
runtime.selectgo(0xc0018abd98, 0xc00076dcc0, 0xc0008e9470?, 0x0, 0xc0008f21e0?, 0x1)
	/usr/local/go/src/runtime/select.go:327 +0x725 fp=0xc0018abc50 sp=0xc0018abb30 pc=0x44e7e5
github.com/testground/sdk-go/run.invoke(0xc0005c7c80, {0x207ae00?, 0x2af4270})
	/go/pkg/mod/github.com/testground/[email protected]/run/invoker.go:180 +0x795 fp=0xc0018abeb0 sp=0xc0018abc50 pc=0x8f4e75
github.com/testground/sdk-go/run.InvokeMap(0xc0000061a0?)
	/go/pkg/mod/github.com/testground/[email protected]/run/invoker.go:77 +0x85 fp=0xc0018abf28 sp=0xc0018abeb0 pc=0x8f45e5
main.main()
	/plan/main.go:13 +0x1a fp=0xc0018abf40 sp=0xc0018abf28 pc=0x1ed799a
runtime.main()
	/usr/local/go/src/runtime/proc.go:267 +0x2bb fp=0xc0018abfe0 sp=0xc0018abf40 pc=0x43dedb
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1650 +0x1 fp=0xc0018abfe8 sp=0xc0018abfe0 pc=0x471c01
@evan-forbes evan-forbes added T:Bug Type: Bug (confirmed) cat 🐈 labels Feb 1, 2024
@evan-forbes evan-forbes self-assigned this Feb 2, 2024
@cmwaters
Contributor

cmwaters commented Feb 6, 2024

Interesting find. Theoretically, we should only recurse once per connected peer and no more. Will need to investigate further.
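To make the "at most once per peer" bound concrete, here is a minimal sketch of an iterative lookup that cannot run longer than the number of peers in the seen set. It is not the reactor's code or the fix in #1215: `peerID`, `pickConnectedPeer`, and both maps are hypothetical stand-ins for illustration.

```go
package main

import "fmt"

// peerID is a hypothetical stand-in for the reactor's numeric peer IDs.
type peerID uint16

// pickConnectedPeer returns a peer that has seen the transaction and is still
// connected. Stale entries are deleted as they are encountered, so the loop
// body runs at most once per entry in seen.
func pickConnectedPeer(seen map[peerID]struct{}, connected map[peerID]bool) (peerID, bool) {
	for id := range seen {
		if connected[id] {
			return id, true
		}
		// Deleting the current key while ranging over a map is safe in Go.
		delete(seen, id)
	}
	return 0, false
}

func main() {
	seen := map[peerID]struct{}{1: {}, 2: {}, 3: {}}
	connected := map[peerID]bool{3: true} // assumption: only peer 3 is still connected

	id, ok := pickConnectedPeer(seen, connected)
	fmt.Println(id, ok) // prints "3 true"
}
```

Because every disconnected candidate is removed before the next one is tried, the work is bounded by the size of the set regardless of the order in which the map yields its keys.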

@evan-forbes
Member Author

yeah, I'm still questioning how this gets hit

this did fix it tho

f1ca3f5

the first test introduced there will fail on main with the same recursive call, however I don't think that scenario should ever actually occur

@evan-forbes
Member Author

evan-forbes commented Feb 6, 2024

without this line in the first test

delete(reactor.mempool.seenByPeersSet.set[wantedTx.Key()].peers, reactor.ids.GetIDForPeer(peers[1].ID()))

the test on main will continually call findNewPeerToRequestTx until the seenTxs map randomly returns a different peer. With that line, it will return the first peer every time, resulting in an infinite loop
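The scenario described above can be reproduced in isolation with a small sketch. This is not the reactor's code: `peerID`, `anySeenPeer`, `requestFrom`, and the `maxDepth` guard are all hypothetical, and the guard exists only so the demo terminates instead of overflowing the stack.

```go
package main

import "fmt"

// peerID is a hypothetical stand-in for the reactor's numeric peer IDs.
type peerID uint16

// anySeenPeer returns whichever entry Go's randomized map iteration yields first.
func anySeenPeer(seen map[peerID]struct{}) (peerID, bool) {
	for id := range seen {
		return id, true
	}
	return 0, false
}

// requestFrom retries by calling itself without removing a disconnected peer
// from seen. It returns the number of calls made and whether a connected peer
// was found. maxDepth is a demo-only guard.
func requestFrom(seen map[peerID]struct{}, connected map[peerID]bool, depth, maxDepth int) (int, bool) {
	if depth >= maxDepth {
		return depth, false // without the guard, this path keeps recursing
	}
	id, found := anySeenPeer(seen)
	if !found {
		return depth + 1, false
	}
	if connected[id] {
		return depth + 1, true
	}
	// The stale peer is still in seen, so the next call may pick it again.
	return requestFrom(seen, connected, depth+1, maxDepth)
}

func main() {
	// Only a disconnected peer remains: every call picks it again.
	calls, ok := requestFrom(map[peerID]struct{}{1: {}}, map[peerID]bool{}, 0, 1000)
	fmt.Println(calls, ok) // prints "1000 false"

	// With a second, connected peer in the set, random iteration order
	// eventually yields it, so the recursion usually ends after a few calls.
	calls, ok = requestFrom(map[peerID]struct{}{1: {}, 2: {}}, map[peerID]bool{2: true}, 0, 1000)
	fmt.Println(calls, ok)
}
```

With a single stale entry the lookup has nothing else to return, so only the guard stops the recursion. With two entries the call count depends on map iteration order, which is why no fixed output is given for the second case.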

@cmwaters cmwaters self-assigned this Feb 6, 2024
cmwaters added a commit that referenced this issue Feb 6, 2024
## Description

Closes: #1208

---

#### PR checklist

- [ ] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
[unclog](https://github.com/informalsystems/unclog) to manage our
changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code
comments
mergify bot pushed a commit that referenced this issue Feb 6, 2024
(cherry picked from commit 52b993c)